Open shreevatsa opened 5 months ago
The reason for starting with OCR instead of a blank editor was to avoid having to deal with splitting already-added pages. But it may be useful to instead add one page at a time (the OCR option could then be chosen on a per-page basis). This would also be better than firing off hundreds of requests to the Google Vision API.
Possible UI:
Example of a modal div: https://chatgpt.com/c/fd4eae00-9c84-4e0d-bb82-06bc9873207a (attachment: modal.zip)
After chunk split (either LR or UD) is decided, we need to:
… `xmin` and `xmax`) to the doc.

Currently the schema already has lines' `xmin` and `xmax`, so in principle no schema changes are needed, although maintaining `words` is a bit annoying (#21).
We need some code refactoring to be able to do (2) easily.
Will likely need this