Figure out OCR performance

SuffolkLITLab / docassemble-MotionToStayEviction

A Docassemble interview for the Massachusetts Appeals Court Motion to Stay Eviction

MIT License


Some compromises:

Upstream in AssemblyLine, we've added OCRmyPDF, which skips pages that already have text on them. This lets people upload very large PDFs, as long as most of the pages already contain text. They can still upload pictures, but only one at a time, and hopefully not hundreds of them.
Things do still take a long time sometimes, but it's okay overall, and the processing isn't retriggered when you go to the next page (unlike as_pdf).
Users will now wait before e-filing the forms, which wasn't happening before.
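
For reference, the page-skipping behavior in the first point can be sketched with the OCRmyPDF command line, assuming the `ocrmypdf` executable is installed and on the PATH. The helper names and file paths here are hypothetical, and this is not the code AssemblyLine actually uses:

```python
import subprocess
from pathlib import Path


def build_ocr_command(input_pdf, output_pdf):
    """Build an OCRmyPDF command that leaves pages with existing text alone.

    Hypothetical helper for illustration only.
    """
    # --skip-text: pages that already contain text are passed through
    # without OCR, so only image-only pages cost processing time.
    return ["ocrmypdf", "--skip-text", str(input_pdf), str(output_pdf)]


def ocr_upload(input_pdf, output_pdf):
    """Run OCR on an uploaded PDF and return the path of the result.

    Raises subprocess.CalledProcessError if OCRmyPDF exits with an error.
    """
    subprocess.run(build_ocr_command(input_pdf, output_pdf), check=True)
    return Path(output_pdf)
```

With this flag, a large PDF that is mostly text-bearing pages should finish quickly, while a PDF made entirely of scanned images still pays the full OCR cost for every page.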

Nothing much else to do here. I'll close this with #61, but in general, we should keep an eye out for other ways to improve OCR and PDF performance.
