jlsutherland / doc2text

Detect text blocks and OCR poorly scanned PDFs in bulk. Python module available via pip.
MIT License
1.27k stars, 97 forks