Closed piscopancer closed 7 months ago
Handwritten text is not supported by Tesseract. The Tesseract OCR model is built around assumptions that only hold for printed text. No combination of options will significantly improve performance with handwritten text. Unless your handwriting is so good that it closely resembles printed text, the results will be poor.
@Balearica Understood. I will keep it in my project anyway because I need to let users scan images. Can you recommend a library for handwriting recognition? I have learned about handwriting.js, but it is a GitHub-only project that was never published to npm, it uses outdated JavaScript, its API is painful, and it has no TypeScript support. Also, is it possible to feed Tesseract a different dataset compiled from photos of handwritten Chinese characters? Would that work?
@piscopancer I do not personally know of any libraries for recognizing handwritten text.
It may be possible to improve results using different language data, however I don't know of any existing language data that does this, and am not overly optimistic that language data could be created. You can try searching the main Tesseract git issues or user forum for past discussion. Handwriting gets brought up from time to time, and to the best of my knowledge, nobody has ever claimed they can make it work with high accuracy.
tesseract 5.0.5
I use a canvas and feed it to a worker. It succeeds in recognizing handwritten Chinese characters about 1 time in 4; in other words, it works, but I expected more. I may have misconfigured my Tesseract worker, or perhaps Tesseract is not trained to work with handwritten characters and I just do not know it?
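Since the input here is a canvas, one thing that can be ruled out independently of the handwriting question is the canvas background. A canvas is transparent by default, so dark strokes drawn on it may lose contrast if the alpha channel is dropped along the way. The sketch below is only an illustration, under the assumption that the pixels come from `getImageData()`; `flattenAndBinarize` is a hypothetical helper, not part of tesseract.js, and it is not expected to make Tesseract handle handwriting well.

```typescript
// Hypothetical helper (not a tesseract.js API).
// Input: RGBA bytes as found in CanvasRenderingContext2D.getImageData().data.
// Output: a new RGBA buffer, composited onto white and thresholded to
// pure black or pure white, fully opaque.
function flattenAndBinarize(
  rgba: Uint8ClampedArray,
  threshold: number = 128, // assumed cut-off; tune for your strokes
): Uint8ClampedArray {
  const out = new Uint8ClampedArray(rgba.length);
  for (let i = 0; i < rgba.length; i += 4) {
    const alpha = rgba[i + 3] / 255;
    // Composite each colour channel over a white background.
    const r = rgba[i] * alpha + 255 * (1 - alpha);
    const g = rgba[i + 1] * alpha + 255 * (1 - alpha);
    const b = rgba[i + 2] * alpha + 255 * (1 - alpha);
    // Perceptual brightness using Rec. 601 luma weights.
    const luma = 0.299 * r + 0.587 * g + 0.114 * b;
    const value = luma < threshold ? 0 : 255;
    out[i] = value;
    out[i + 1] = value;
    out[i + 2] = value;
    out[i + 3] = 255; // always opaque
  }
  return out;
}
```

In a browser, the result could be written back with `putImageData()` before the canvas is handed to the worker; whether this changes the recognition rate at all would need to be measured on real input.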
Please tell me how to improve the inference.
This is my code (Next.js, TypeScript):