Open jzohrab opened 6 months ago
No longer blocked.
This is slightly more complicated than the hacky code marked with the TODO, or the language_term_export.py thing.
The current hacky code doesn't include multiword terms. For languages like Classical Chinese, that's important.
I think that what needs to happen is an in-memory "render" of each page, something like read.service.start_reading, but without saving all of the status 0 terms. The resulting paragraphs will contain all of the text tokens, including net new ones (not saved) and saved status 0 ones, and all the rest, of course.
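A rough sketch of that idea, to make the shape concrete. This is not Lute's actual rendering code: Token, render_page, and unknown_terms are hypothetical names, the tokenizer is a naive whitespace split, and multiword matching is a simple greedy longest match against the saved terms. The point is only that each page gets "rendered" in memory, nothing is written back, and both net new tokens (no saved term) and saved status 0 terms end up in the result.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional


@dataclass(frozen=True)
class Token:
    """Hypothetical stand-in for one rendered text item."""
    text: str
    status: Optional[int]  # None = net new (never saved), 0 = saved but unknown


def render_page(page_text: str, saved_terms: Dict[str, int]) -> List[Token]:
    """Render one page in memory; saved_terms is only read, never modified."""
    words = page_text.split()
    # Longest saved term, measured in words, bounds the multiword lookahead.
    max_len = max((len(t.split()) for t in saved_terms), default=1)
    tokens: List[Token] = []
    i = 0
    while i < len(words):
        # Try the longest candidate first so multiword terms win over single words.
        for n in range(min(max_len, len(words) - i), 0, -1):
            candidate = " ".join(words[i:i + n]).lower()
            if n == 1 or candidate in saved_terms:
                tokens.append(Token(candidate, saved_terms.get(candidate)))
                i += n
                break
    return tokens


def unknown_terms(pages: Iterable[str], saved_terms: Dict[str, int]) -> Counter:
    """Count net new tokens and saved status 0 terms across all pages."""
    counts: Counter = Counter()
    for page_text in pages:
        for tok in render_page(page_text, saved_terms):
            if tok.status is None or tok.status == 0:
                counts[tok.text] += 1
    return counts
```

With saved terms {"tengo": 1, "un gato": 0, "gato": 3} and the pages "Tengo un gato negro" and "un gato y un perro", the multiword term "un gato" is counted as a unit on both pages, and "tengo" (status 1) is left out.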
The test cases for this are pretty easy, even if the code isn't:
Since this is long-running, it may need some kind of WebSocket connection to report progress back to the client.
Some good interim progress. Hacked at the language term export job quite a lot, and added a new book_term_export <bookid> <filename> CLI job, e.g.:
flask --app lute.app_factory cli book_term_export 432 sp_terms.csv
This is a bit slower than the old job, b/c it essentially does the calculations for a full page render for each page. It feels like it should be faster, but whatever.
This can't be added to the "actions" dropdown, b/c it doesn't communicate well back to the client. The job just prints to the command line, but when clicked from the web ui the job should really communicate back via a web socket, and then download the file at the end. Since the job is slow-ish, the user should be notified what's happening.
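One possible way to structure this, sketched under assumptions: keep the export loop ignorant of the transport and hand it a progress callback. The CLI could pass a callback that prints, and a web handler could pass one that pushes a message over a web socket. The names export_with_progress, process_page, and report are hypothetical and not part of Lute.

```python
from typing import Callable, Iterable, List, TypeVar

P = TypeVar("P")  # page type
R = TypeVar("R")  # per-page result type


def export_with_progress(
    pages: Iterable[P],
    process_page: Callable[[P], R],
    report: Callable[[int, int], None],
) -> List[R]:
    """Process each page and call report(done, total) after every page."""
    page_list = list(pages)  # materialize so the total is known up front
    total = len(page_list)
    results: List[R] = []
    for done, page in enumerate(page_list, start=1):
        results.append(process_page(page))
        report(done, total)  # transport-specific: print, web socket send, etc.
    return results


def print_progress(done: int, total: int) -> None:
    """Callback suitable for the command line."""
    print(f"page {done} of {total}")
```

For the CLI this would be called as export_with_progress(pages, process_page, print_progress); the web version would swap in a callback that sends the same two numbers to the client, then triggers the file download once the call returns.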
Blocked by #316 - this is done now
The parent mapping export used to have a thing to export all unknown terms. That could be useful for loading up vocab lists for books.
The code has some TODO issue_336_export_unknown_book_terms markers for things that should be used for this.
UPDATE: Lute has a CLI job to export book terms -- see the comment below for notes about what's needed to make this a book action callable from the UI.
As part of this work, any code with TODO issue_336_export_unknown_book_terms should be removed, as I don't think it's used anymore.