Separate bounding boxes for parts of the same character

Optical character recognition is a challenging problem, and this code was written to experiment with simple techniques rather than to be a top-of-the-range classifier. My guess is that the lower parts of the characters end up in separate boxes because a thin strip of white space is left between them and the rest of the character after thresholding. Several ways to overcome this could include:

- Increasing the amount of blur in the x-axis and lowering the threshold
  - This way the characters would appear thicker in the 'Otsu Adaptive Thresholding' step
- Merging bounding boxes if they are very close together
- Adding padding to the edge of the bounding boxes
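
As a rough illustration of the last two suggestions, here is a minimal sketch in plain Python. It assumes each box is an `(x, y, width, height)` tuple, which is the format `cv2.boundingRect` returns. The function names, the `max_gap` and `pad` parameters, and the example values are all hypothetical and are not taken from the repository, so they would need tuning for your images.

```python
from typing import List, Tuple

# Assumed box format: (x, y, width, height), as returned by cv2.boundingRect.
Box = Tuple[int, int, int, int]


def boxes_are_close(a: Box, b: Box, max_gap: int) -> bool:
    """True if the gap between two boxes is at most max_gap pixels on both axes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    # A gap of 0 means the boxes overlap or touch on that axis.
    gap_x = max(bx - (ax + aw), ax - (bx + bw), 0)
    gap_y = max(by - (ay + ah), ay - (by + bh), 0)
    return gap_x <= max_gap and gap_y <= max_gap


def union(a: Box, b: Box) -> Box:
    """Smallest box that contains both a and b."""
    x0, y0 = min(a[0], b[0]), min(a[1], b[1])
    x1 = max(a[0] + a[2], b[0] + b[2])
    y1 = max(a[1] + a[3], b[1] + b[3])
    return (x0, y0, x1 - x0, y1 - y0)


def merge_close_boxes(boxes: List[Box], max_gap: int) -> List[Box]:
    """Repeatedly merge any pair of nearby boxes until no pair is close."""
    merged = list(boxes)
    changed = True
    while changed:
        changed = False
        for i in range(len(merged)):
            for j in range(i + 1, len(merged)):
                if boxes_are_close(merged[i], merged[j], max_gap):
                    merged[i] = union(merged[i], merged[j])
                    del merged[j]
                    changed = True
                    break
            if changed:
                break  # restart the scan, since the list has changed
    return merged


def pad_box(box: Box, pad: int, img_w: int, img_h: int) -> Box:
    """Grow a box by pad pixels on every side, clipped to the image bounds."""
    x, y, w, h = box
    x0, y0 = max(x - pad, 0), max(y - pad, 0)
    x1, y1 = min(x + w + pad, img_w), min(y + h + pad, img_h)
    return (x0, y0, x1 - x0, y1 - y0)


# Hypothetical example: a character body, its detached lower stroke,
# and an unrelated character further along the line.
example = [(10, 10, 20, 20), (10, 32, 20, 8), (100, 10, 20, 20)]
merged_example = merge_close_boxes(example, max_gap=3)
padded_example = [pad_box(b, pad=2, img_w=200, img_h=100) for b in merged_example]
```

If `max_gap` is too large, neighbouring characters will start to merge as well, so it is probably worth choosing it relative to the typical character height rather than as a fixed pixel count.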

I hope this answers your question and that with some experimentation you are able to improve the performance of your classifier!

TomMakesThings / Sinhala-Optical-Character-Recognition
