impactcentre / ocrevalUAtion

OCR evaluation brought to you by University of Alicante
Apache License 2.0

Question about future plans #10

Closed by Serge-Smirnov 10 years ago

Serge-Smirnov commented 10 years ago

Hello. First of all I want to thank you for this project. I see that the project continues to grow and evolve. Is there a roadmap for this project? What are your next plans?

P.S. I work in St. Petersburg, Russia. I am currently developing a system for digitizing documents from state archives. That is why we are very interested in OCR evaluation tools.

rccarrasco commented 10 years ago

Dear Serge,

I am now working on some features (such as options to ignore case or diacritics) after a request from people at the BnF (National Library of France). We are collecting the experience of people at digitization departments: thanks to the Succeed project (http://succeed-project.eu), the British Library and the University Library of Leuven will also contribute.

The project started as a (simple) challenge but is now growing. If we receive positive feedback, we may keep it evolving. Possible continuations, once we have an acceptable evaluation tool, are:

- create tools to evaluate OCR when little or no ground-truth text is available
- create code to improve OCR results based on the evaluation results (e.g. simplifying the training of some popular engines, such as Tesseract)

I will appreciate all feedback from the community.

Best