Open wanghaisheng opened 8 years ago
@rakeshvar It would be great if you could list the steps that need to be followed to extend banti to other languages.
@wanghaisheng You can probably find a lot of implementations of Chinese OCR elsewhere on the web. It is a problem that has received much more attention than Indian-language OCR. But if you want to follow along the same lines, here is a brief outline.
@ChillarAnand I am not sure how well the banti framework lends itself to extension. It can be extended, no doubt, but I am thinking of the chamanti framework, which is much easier to extend. You might be interested in working on that. I can post guidelines for it.
What do you think is the best way to make this collaborative with a minimal amount of work from my side? (I really cannot spend much time on these things.) A github.io page? A Google group? Ideally there would be a post, with scope for discussion and questions. Please do suggest. Thanks.
@rakeshvar Should we use the GitHub issue tracker itself for discussion?
A blog post would be best.
I have a lot of XPS/PDF files that can be converted to JPEG files. 1. Do I need to generate millions of Chinese characters, like your "datagen_initio"? 2. What about fonts and encoding for Chinese characters ("Mallicodes")? 3. Do I need to prepare box files generated by antanci_segmenter / OCR Segmenter?