badonhill-io / randeli

Augment PDFs and e-books to aid neurodivergent (ADD/ADHD) reading
GNU General Public License v2.0

Short-term workaround to handle documents using CID fonts (e.g. from XeLaTeX) #2

Closed (nhi-vanye closed 1 year ago)

nhi-vanye commented 1 year ago

Add --force-ocr to skip processing of a page's text elements and put the entire page through OCR instead.
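A minimal sketch of how such a flag might gate per-page processing. `Page`, `choose_mode`, and the mode names are hypothetical and not taken from randeli's code; the fallback to OCR for pages with no text elements is also an assumption.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Page:
    # Hypothetical page model: text elements extracted from the PDF, if any.
    text_elements: List[str] = field(default_factory=list)


def choose_mode(page: Page, force_ocr: bool = False) -> str:
    """Return "ocr" or "text" for a page (illustrative only)."""
    if force_ocr:
        # Ignore any text elements and OCR the whole page.
        return "ocr"
    # Assumed default: use text elements when present, otherwise OCR.
    return "text" if page.text_elements else "ocr"
```

With the flag set, a page that has usable-looking text elements (for example, CID-encoded glyphs that do not map back to characters) would still go through OCR.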

On a per-document basis, policy.strong_box_height may need to be modified to reduce overlapping boxes (use 0.6 in run-samples.sh).
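To illustrate why a smaller value can reduce overlap, here is a rough sketch that assumes the setting acts as a scale factor on each box's height around its vertical centre. That interpretation, and the names `scale_box` and `overlaps`, are assumptions for illustration; the issue does not define how randeli applies the value.

```python
from typing import Tuple

Box = Tuple[float, float]  # (top, bottom) vertical extent of a box


def scale_box(box: Box, factor: float) -> Box:
    """Shrink or grow a box's height by `factor`, keeping its centre fixed."""
    top, bottom = box
    centre = (top + bottom) / 2
    half = (bottom - top) * factor / 2
    return (centre - half, centre + half)


def overlaps(a: Box, b: Box) -> bool:
    """True if the two vertical extents intersect."""
    return a[1] > b[0] and b[1] > a[0]


# Two OCR boxes on adjacent lines that overlap at full height.
line1, line2 = (0.0, 10.0), (8.0, 18.0)
full = overlaps(line1, line2)
shrunk = overlaps(scale_box(line1, 0.6), scale_box(line2, 0.6))
```

Under this assumption, `full` is true and `shrunk` is false: scaling both boxes to 0.6 of their height pulls their edges apart.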

Update the XeLaTeX augmented samples to show this new mode