tesseract-ocr / tessdata

Trained models with fast variant of the "best" LSTM models + legacy models
Apache License 2.0

Which font is used for Bengali tessdata? #167

Closed by debdip 1 year ago

debdip commented 2 years ago

Hello, I installed Tesseract successfully with the Bengali traineddata, but the output is not as expected. The output characters do not seem to be Unicode Bengali characters. Can anyone tell me which font this is?

I used the Windows 10 command prompt with this command: "tesseract img2.jpg Bengali"

[Image: img2]

And the output is this:

Serate fran atemtora AHH Sane 6 aeaerla aay aera Fiftice era wreTE x, Sa MA ASST TAG AA AP (SHAAA) STATA 00.50.2028 BE AE 04.30.2028 TAT aid ora frema wry Catiee Biba iam Stee | Sa ATT ce ware Re at aT ara Se via fry wate efberain ace Is facies feed eee are SheTs ATER ea GTS ATER |

STSTRM, AMT AIRE CTA FAPOS als MCAT SAT 00.30.2023 Fi AWE 0,90, 20222 one et SASS) ATE aR ATA (GRETA) ex BB MAR aa TR Facsow Pea aE ‘gare iets aTEY seal GAT APS SecA aT S2T |
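
One way to confirm that output like the above contains no Bengali Unicode text is to count how many characters fall in the Unicode Bengali block (U+0980 to U+09FF). The following is a minimal Python sketch, not part of Tesseract; the function name `bengali_ratio` is hypothetical and chosen only for illustration.

```python
def bengali_ratio(text: str) -> float:
    """Return the fraction of non-whitespace characters in the Bengali block."""
    # Ignore spaces and newlines so layout does not skew the ratio.
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return 0.0
    # The Unicode Bengali block spans U+0980 through U+09FF.
    bengali = sum(1 for c in chars if "\u0980" <= c <= "\u09ff")
    return bengali / len(chars)


if __name__ == "__main__":
    # A fragment of the output quoted in the post, and a real Bengali word.
    print(bengali_ratio("Serate fran atemtora AHH Sane"))
    print(bengali_ratio("বাংলা"))
```

A ratio near zero, as for the Latin-looking fragment, suggests the text was recognized with a non-Bengali model rather than rendered in an unusual font.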

stweil commented 1 year ago

You might try "tesseract https://user-images.githubusercontent.com/11462610/192587779-ea217209-6769-4681-b578-717ccf341a61.jpg - -l ben".

And please use the Tesseract user forum for questions.
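
For context on the suggested command: Tesseract's command line takes the input image first, then the output base name, then options such as "-l". In the original command the word "Bengali" sits in the output-base position and no language is selected, so Tesseract falls back to its default language (English). The sketch below builds the corrected command from Python. It assumes that `tesseract` is on the PATH and that `ben.traineddata` is installed; the helper name `build_tesseract_cmd` is hypothetical.

```python
import shutil
import subprocess
from typing import List


def build_tesseract_cmd(image: str, lang: str = "ben", outputbase: str = "-") -> List[str]:
    """Build the argument list: tesseract IMAGE OUTPUTBASE -l LANG.

    An outputbase of "-" sends the recognized text to standard output.
    """
    return ["tesseract", image, outputbase, "-l", lang]


if __name__ == "__main__":
    cmd = build_tesseract_cmd("img2.jpg")
    if shutil.which("tesseract") is None:
        # Assumption not met: just show what would have been run.
        print("tesseract not found on PATH; would run:", " ".join(cmd))
    else:
        # Decode the output as UTF-8 so Bengali characters survive intact.
        result = subprocess.run(cmd, capture_output=True, text=True, encoding="utf-8")
        print(result.stdout)
```

On a Windows console the Bengali text may still display incorrectly if the active code page or console font cannot render it; writing to a file (for example an output base of "out", which produces out.txt) avoids that.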