siphyshu / vitb-timetable-parser

🔎 Parse VITB timetable screenshots to csv/json
https://vitb-tt2json.streamlit.app/

Consider alternatives to tesseract-ocr #3

Closed siphyshu closed 4 months ago

siphyshu commented 5 months ago

Research and test alternatives to tesseract-ocr

Setting up tesseract is too tedious for the project to be deployed easily as a web app (on Streamlit, for example): the steps involve downloading the binary and adding it to the PATH.

If there were a way to simply import a library and get 98% accurate results, it would be a perfect fit.

siphyshu commented 4 months ago

I don't think this discussion is needed anymore. When I opened this issue, I was not aware that tesseract-ocr is relatively simple to install on Linux: it's just sudo apt install tesseract-ocr. That also makes it easy to set up on cloud platforms. So tesseract works for now.
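
For anyone landing here later, a minimal sketch of how an app could check at startup that the apt-installed binary is actually reachable. The function name tesseract_version is hypothetical and not taken from this repository; the sketch only assumes that the executable is called tesseract and accepts --version.

```python
import shutil
import subprocess
from typing import Optional


def tesseract_version(binary: str = "tesseract") -> Optional[str]:
    """Return the first line of `<binary> --version`.

    Returns None if the binary is not on PATH or prints nothing.
    The name of this helper is illustrative, not part of the project.
    """
    # shutil.which mirrors what the shell does when resolving a command name.
    path = shutil.which(binary)
    if path is None:
        return None

    # Some tesseract builds print the version to stderr, others to stdout,
    # so both streams are checked.
    result = subprocess.run(
        [path, "--version"], capture_output=True, text=True, check=False
    )
    output = (result.stdout or result.stderr).strip()
    return output.splitlines()[0] if output else None


if __name__ == "__main__":
    version = tesseract_version()
    if version is None:
        print("tesseract not found; try: sudo apt install tesseract-ocr")
    else:
        print(f"found {version}")
```

On Streamlit Community Cloud, apt packages are listed one per line in a packages.txt file at the repository root, so a single line reading tesseract-ocr would be the hosted equivalent of the apt command above. Whether this repository is set up that way is an assumption.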