katanaml / sparrow

Data processing with ML, LLM and Vision LLM
https://katanaml.io
GNU General Public License v3.0
3.73k stars, 379 forks

Add multi-language support #59

Closed. dinhquangsonimip closed this 4 months ago

dinhquangsonimip commented 4 months ago

I added a langs parameter to support multiple languages.

To run with more languages, add the Tesseract .traineddata file for each language to the tessdata folder.

On Windows, download the .traineddata files from https://github.com/tesseract-ocr/tessdata

On Linux, install the language pack from the command line. For French: sudo apt install tesseract-ocr-fra
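A minimal sketch of how the language setup could be checked before running OCR. It assumes language codes are passed as a list (for example `["eng", "fra"]`) and that the tessdata folder location is known. The helper names `missing_traineddata` and `tesseract_lang_arg` are hypothetical and are not taken from this change; how the langs parameter is wired inside Sparrow is not shown here.

```python
from pathlib import Path


def missing_traineddata(langs, tessdata_dir):
    """Return the language codes that have no <code>.traineddata file.

    Hypothetical helper: tessdata_dir is whatever folder your Tesseract
    install reads its language files from.
    """
    tessdata = Path(tessdata_dir)
    return [
        code
        for code in langs
        if not (tessdata / f"{code}.traineddata").is_file()
    ]


def tesseract_lang_arg(langs):
    """Join language codes in the form Tesseract's -l option accepts."""
    return "+".join(langs)
```

The joined string has the same form as the value given to Tesseract on the command line (`tesseract image.png out -l eng+fra`). Checking for missing files first gives a clearer error than letting Tesseract fail when a language pack is not installed.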