CypherousSkies / reading-for-listeners

A deep-learning-powered accessibility application that turns PDFs into audio files. Featuring OCR improvement and TTS with inflection!
GNU Affero General Public License v3.0

Remove Headers and Footers #9

Open CypherousSkies opened 2 years ago

CypherousSkies commented 2 years ago

It's hard to do this with regexes, so it might be worth using OpenCV (as per this post), which functionally means building an in-house version of ocrmypdf. Definitely a long-term project, but it might be a fun diversion while I wait for answers on TTS.

CypherousSkies commented 2 years ago

This would also be fixed by #13

CypherousSkies commented 2 years ago

In the near term, unpaper does have a clear-borders filter, so OpenCV can be used to get the sizes for that!
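
A minimal sketch of how those border sizes might be measured, not the project's actual implementation. It assumes the page has already been binarized into rows of 0/1 pixels (1 = ink), which is the kind of array a thresholded OpenCV image would give; plain Python lists are used here so the example stays self-contained. The names `leading_band` and `border_sizes`, and the `min_gap` and `max_fraction` thresholds, are hypothetical. The idea is a row projection: a header is taken to be the first band of inked rows that is followed by a clear blank gap and sits within a small fraction of the page height, and the footer is found the same way from the bottom.

```python
from typing import List, Sequence, Tuple


def leading_band(has_ink: Sequence[bool], min_gap: int, max_rows: int) -> int:
    """Return how many rows from the start belong to an isolated first band.

    Returns 0 when no band is found, when the blank gap after it is shorter
    than min_gap, or when the band reaches deeper than max_rows.
    """
    n = len(has_ink)
    i = 0
    while i < n and not has_ink[i]:  # skip the blank outer margin
        i += 1
    while i < n and has_ink[i]:      # walk through the first inked band
        i += 1
    band_end = i
    while i < n and not has_ink[i]:  # measure the blank gap after the band
        i += 1
    gap = i - band_end
    if i == n or gap < min_gap or band_end > max_rows:
        return 0
    return band_end


def border_sizes(
    page: List[List[int]], min_gap: int = 3, max_fraction: float = 0.15
) -> Tuple[int, int]:
    """Estimate (top, bottom) row counts occupied by a header and a footer."""
    has_ink = [any(row) for row in page]          # row projection profile
    max_rows = int(len(page) * max_fraction)      # cap on header/footer depth
    top = leading_band(has_ink, min_gap, max_rows)
    bottom = leading_band(has_ink[::-1], min_gap, max_rows)
    return top, bottom
```

The returned row counts could then be used as the sizes of the regions to clear at the top and bottom of each page. `min_gap` needs to be larger than the normal line spacing of the body text, otherwise the first line of a paragraph would be mistaken for a header.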

CypherousSkies commented 2 years ago

Moved to 0.1.0, as I plan to solve this using the Server UI.