Open cilynx opened 2 years ago
Even worse, the GUI freezes up on scans after `original.tiff` drops if `processed.pdf` takes a long time to create. Under GNOME, this is throwing the wait-or-kill dialog.
As we get into more advanced text recognition, we need to think about serial vs parallel processing. How should content on other pages in the same document impact interpretation of any given page? As an extreme example, think about tables that span multiple pages. As a subtler case, context from other pages may improve recognition on any given page.
Scans with dozens of pages take forever to process and don't even start processing until the entire document has been scanned. We should pipeline this stuff so page 1 starts processing as soon as it shows up. If processing one page takes longer than the time to scan the next one, we should multi-process so we can start page 2 while page 1 is still finishing up. Probably want to do a capped thread pool to keep this from getting out of hand.
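To make the pipelining idea concrete, here is a rough sketch using Python's standard `concurrent.futures`. It is not taken from the project's code: `scan_pages`, `process_page`, and `run_pipeline` are hypothetical stand-ins, the sleeps only simulate scanner and OCR latency, and the cap of 4 workers is an arbitrary assumption.

```python
import os
import time
from concurrent.futures import ThreadPoolExecutor


def scan_pages(count):
    """Hypothetical stand-in for the scanner: yields pages one at a time."""
    for page_number in range(1, count + 1):
        time.sleep(0.005)  # simulated time to scan one page
        yield page_number


def process_page(page_number):
    """Hypothetical stand-in for per-page processing (OCR, cleanup, etc.)."""
    time.sleep(0.01)  # simulated processing time, longer than one scan
    return f"page-{page_number}: processed"


def run_pipeline(page_source, process, max_workers=None):
    """Submit each page as soon as it arrives; return results in page order."""
    if max_workers is None:
        # Assumed cap: never more than 4 workers, never more than the CPU count.
        max_workers = min(4, os.cpu_count() or 1)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # The comprehension pulls from the scanner lazily, so page 1 is
        # already processing while page 2 is still being scanned.
        futures = [pool.submit(process, page) for page in page_source]
        # Collecting in submission order keeps the pages in document order.
        return [future.result() for future in futures]


if __name__ == "__main__":
    for line in run_pipeline(scan_pages(5), process_page):
        print(line)
```

Threads are probably enough if the heavy lifting happens in an external process or a library that releases the GIL. If the processing is pure-Python and CPU-bound, `ProcessPoolExecutor` has the same `submit` interface and could be swapped in. Running `run_pipeline` itself off the main thread would also be needed to keep the GUI responsive.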