Add "export unknown terms" (or "export all terms and statuses") action to Book actions #336

Open jzohrab opened 6 months ago

jzohrab commented 6 months ago

Blocked by #316 - this is done now

The parent mapping export used to have a way to export all unknown terms. That could be useful for loading up vocab lists for books.

The code has some TODO issue_336_export_unknown_book_terms markers for things that should be used for this.

UPDATE: Lute has a CLI job to export book terms -- see the comment below for notes about what's needed to make this a book action callable from the UI.

As part of this work, any code with TODO issue_336_export_unknown_book_terms should be removed, as I don't think it's used anymore.

jzohrab commented 6 months ago

No longer blocked.

jzohrab commented 3 months ago

This is slightly more complicated than the hacky code marked with the TODO, or the old parent mapping export.

The current hacky code doesn't include multiword terms. For languages like Classical Chinese, that's important.

I think that what needs to happen is an in-memory "render" of each page, something like read.service.start_reading -- but without saving all of the status 0 terms. The resulting paragraphs will contain all of the text tokens, including net new ones (not saved) and saved status 0 ones, and all the rest, of course.
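A rough sketch of that idea, not Lute's actual code: every name here (`Term`, `unknown_terms_for_page`, `unknown_terms_for_book`, `max_term_len`) is hypothetical, pages are assumed to be pre-tokenized lists of strings, and multiword terms are assumed to be space-joined, which wouldn't hold for languages written without spaces. It only illustrates matching the longest saved term first and collecting both unsaved tokens and saved status 0 terms, without writing anything back.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Term:
    """Hypothetical stand-in for a saved term."""
    text: str    # may span several tokens, e.g. "por favor"
    status: int  # 0 = unknown; anything else = some level of knowledge


def unknown_terms_for_page(tokens, saved_terms, max_term_len=3):
    """Return the unknown terms on one page, in order, without saving anything."""
    by_text = {t.text.lower(): t for t in saved_terms}
    found = []
    i = 0
    while i < len(tokens):
        term = None
        match_len = 1
        # Prefer the longest saved (possibly multiword) term starting here.
        for n in range(min(max_term_len, len(tokens) - i), 0, -1):
            candidate = " ".join(tokens[i:i + n]).lower()
            if candidate in by_text:
                term = by_text[candidate]
                match_len = n
                break
        if term is None:
            # Net new token: never saved, so it counts as unknown.
            found.append(tokens[i].lower())
        elif term.status == 0:
            # Saved, but still status 0.
            found.append(term.text.lower())
        i += match_len
    return found


def unknown_terms_for_book(pages, saved_terms):
    """Collect unique unknown terms across all pages, keeping first-seen order."""
    seen = {}
    for tokens in pages:
        for text in unknown_terms_for_page(tokens, saved_terms):
            seen.setdefault(text, None)
    return list(seen)
```

With saved terms `por favor` (status 0), `hola` (status 3), and `perro` (status 0), a book whose pages tokenize to `["Hola", "por", "favor", "gato"]` and `["gato", "perro"]` would give `por favor`, `gato`, and `perro`: the multiword term is picked up as one unit, and the unsaved `gato` is reported once.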

The test cases for this are pretty easy, even if the code isn't:

Since this is long-running, it may need something like WebSockets to report progress back to the client.

jzohrab commented 3 months ago

Some good interim progress. Hacked at the language term export job quite a lot, and added a new book_term_export <bookid> <filename> cli job, e.g.:

flask --app lute.app_factory cli book_term_export 432 sp_terms.csv

This is a bit slower than the old job, b/c it essentially does the calculations for a full page render for each page. It feels like it should be faster, but whatever.

This can't be added to the "actions" dropdown, b/c it doesn't communicate well back to the client. The job just prints to the command line, but when clicked from the web ui the job should really communicate back via a web socket, and then download the file at the end. Since the job is slow-ish, the user should be notified what's happening.
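One possible shape for that, as a sketch only: have the job yield progress events instead of printing, so the CLI can print them and a web handler could forward them over a socket, then offer the file once the final event arrives. The function name, the event dictionaries, and the single-column CSV are all assumptions for illustration; none of this is existing Lute code, and the real export has its own columns.

```python
import csv
import io


def export_terms_with_progress(pages, extract_terms):
    """Yield one progress event per page, then a final event carrying the CSV text.

    `extract_terms` is a hypothetical callable returning the terms for one page.
    """
    total = len(pages)
    seen = {}
    for n, page in enumerate(pages, start=1):
        for term in extract_terms(page):
            seen.setdefault(term, None)
        # The caller decides how to report this: print, socket message, etc.
        yield {"type": "progress", "page": n, "total": total}

    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["term"])  # assumed header, not the real export format
    for term in seen:
        writer.writerow([term])
    yield {"type": "done", "csv": buf.getvalue()}


if __name__ == "__main__":
    # CLI-style consumer: print progress, then show the CSV.
    demo_pages = [["a", "b"], ["b", "c"]]
    for event in export_terms_with_progress(demo_pages, lambda page: page):
        if event["type"] == "progress":
            print(f"page {event['page']} of {event['total']}")
        else:
            print(event["csv"], end="")
```

Keeping the reporting out of the job itself means the same code could serve both the command line and the UI; only the consumer loop would differ.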