Amr-Aboshama / XGeN

An automated Exam Generator using Natural Language Processing Techniques.

OCR needs to define a minimum text size for the PDF #1

Closed AymanAzzam closed 3 years ago

AymanAzzam commented 3 years ago

When the text is very small, as in the references, the OCR can't detect the spaces between characters.

AymanAzzam commented 3 years ago

We will use word segmentation.

Recommended solution: pick random pages and check the average word size. If it is greater than 6, apply word segmentation.
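
A rough sketch of that check, for illustration only and not the project's actual code. The function names (`needs_segmentation`, `segment`), the sample size, and the small vocabulary passed to `segment` are hypothetical. It also assumes "word size" means word length in characters, and it uses a simple dictionary-based split as a stand-in for whatever word segmentation method is finally chosen.

```python
import random


def needs_segmentation(pages, sample_size=3, threshold=6.0, seed=None):
    """Sample random pages and report whether the mean word length exceeds the threshold.

    `pages` is a list of OCR text strings, one per page (assumed input shape).
    """
    rng = random.Random(seed)
    sample = rng.sample(pages, min(sample_size, len(pages)))
    # Assumption: "word size" is the length of a whitespace-separated token.
    lengths = [len(word) for page in sample for word in page.split()]
    if not lengths:
        return False
    return sum(lengths) / len(lengths) > threshold


def segment(token, vocabulary):
    """Split a run-together token into known words using the fewest pieces.

    Returns [token] unchanged when no full split into vocabulary words exists.
    """
    n = len(token)
    max_len = max((len(w) for w in vocabulary), default=0)
    # best[i] holds the best split of token[:i], or None if none was found.
    best = [None] * (n + 1)
    best[0] = []
    for end in range(1, n + 1):
        for start in range(max(0, end - max_len), end):
            piece = token[start:end]
            if best[start] is None or piece.lower() not in vocabulary:
                continue
            candidate = best[start] + [piece]
            if best[end] is None or len(candidate) < len(best[end]):
                best[end] = candidate
    return best[n] if best[n] is not None else [token]


def fix_page(text, vocabulary):
    """Apply segmentation to every token on a page and rejoin with spaces."""
    words = []
    for token in text.split():
        words.extend(segment(token, vocabulary))
    return " ".join(words)
```

In use, one would call `needs_segmentation(pages)` once per document and run `fix_page` on each page only when it returns `True`, so normally sized text is left untouched.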