TomMakesThings / Sinhala-Optical-Character-Recognition

An AI project in which I created a basic optical character recognition system to convert images of printed Sinhalese characters into text using a KNN classifier

Separate bounding boxes for parts of the same character #1

Open uvindub opened 1 year ago

uvindub commented 1 year ago

I got separate bounding boxes for parts of the same character. How can I overcome this issue?

TomMakesThings commented 1 year ago

Optical character recognition is a challenging problem and the code was written with the aim of playing around with simple techniques rather than being a top-of-the-range classifier. At a guess I'd say the lower characters are being separated into different boxes because of the slight white space between them after thresholding. Several ways to overcome this could include:
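
As one illustration of the kind of fix this diagnosis points to, the sketch below merges bounding boxes that are separated by only a small gap, so that a detached lower part is joined to the character above it. It is a minimal, dependency-free sketch and not code from this repository: the `(x, y, width, height)` box layout, the function names, and the `max_gap_x` / `max_gap_y` thresholds are all assumptions made for the example, and the thresholds would need tuning against real images.

```python
from typing import List, Tuple

# Hypothetical box layout for this sketch: (x, y, width, height) in pixels.
Box = Tuple[int, int, int, int]


def boxes_are_close(a: Box, b: Box, max_gap_x: int, max_gap_y: int) -> bool:
    """Return True if the boxes overlap or lie within the allowed gaps."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    # Gap along each axis; zero or negative means the boxes touch or overlap there.
    gap_x = max(ax, bx) - min(ax + aw, bx + bw)
    gap_y = max(ay, by) - min(ay + ah, by + bh)
    return gap_x <= max_gap_x and gap_y <= max_gap_y


def union(a: Box, b: Box) -> Box:
    """Return the smallest box that contains both a and b."""
    x0, y0 = min(a[0], b[0]), min(a[1], b[1])
    x1 = max(a[0] + a[2], b[0] + b[2])
    y1 = max(a[1] + a[3], b[1] + b[3])
    return (x0, y0, x1 - x0, y1 - y0)


def merge_close_boxes(boxes: List[Box], max_gap_x: int = 0, max_gap_y: int = 3) -> List[Box]:
    """Repeatedly merge any pair of close boxes until no pair remains."""
    merged = list(boxes)
    changed = True
    while changed:
        changed = False
        for i in range(len(merged)):
            for j in range(i + 1, len(merged)):
                if boxes_are_close(merged[i], merged[j], max_gap_x, max_gap_y):
                    merged[i] = union(merged[i], merged[j])
                    del merged[j]
                    changed = True
                    break
            if changed:
                break
    return merged


# Example: a character body, its detached lower part, and a separate character.
example = [(0, 0, 10, 10), (2, 12, 6, 3), (30, 0, 10, 10)]
result = merge_close_boxes(example)
```

Here the first two boxes are merged because they overlap horizontally and are only 2 pixels apart vertically, while the third stays separate. Keeping `max_gap_x` small is a deliberate choice in this sketch, since a generous horizontal threshold would risk joining neighbouring characters.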

I hope this answers your question and that with some experimentation you are able to improve the performance of your classifier!