Open TaunT opened 1 month ago
Are you on Default when you do this?
I was on my own branch, but I have now switched to main and checked again. If the language in the settings is English, recognition is worse; if it is French, it is perfect.
No, what I mean is: which OCR option do you have in settings when this happens? Default, Microsoft or Google?
@ogkalu2
> No, what I mean is: which OCR option do you have in settings when this happens? Default, Microsoft or Google?

I didn't understand at first. Yes, it is set to Default in the settings.
Yes, that's it. Pororo also supports English, and I think it's better for that too. Maybe I should switch Default to it?
Can you set your source language to Korean and see if the English results are better than with EasyOCR?
In "Korean" mode some words were not recognized and were replaced with Korean characters.
We should try changing the default model in the code.
By the way, GPT also fixes the problem of intersecting blocks.
@ogkalu2 Would it be nice to add the choice of GPT to the OCR settings?
> In "Korean" mode some words were not recognized and were replaced with Korean characters.
> We should try changing the default model in the code.
> By the way, GPT also fixes the problem of intersecting blocks.

In modules/ocr, line 203, change `self.pororo_cache = PororoOcr()` to `self.pororo_cache = PororoOcr(lang='en')` and see if it's better (still keep the source language as Korean in the UI).
What do you mean by the problem of intersecting blocks?
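For anyone who wants to flip the language back and forth without re-editing the file, here is a minimal sketch of the same idea. `PororoOcr` below is a stand-in stub, not the project's real class, and `OcrEngineCache` and `get_pororo` are hypothetical names; the only thing taken from the suggestion above is that the constructor accepts `lang='en'`.

```python
class PororoOcr:
    """Stand-in stub for the project's Pororo wrapper.

    Assumption: the real class accepts a `lang` keyword, as in the
    suggested edit. The "ko" default here is only a placeholder.
    """

    def __init__(self, lang="ko"):
        # The real class would load a model here; the stub only records the language.
        self.lang = lang


class OcrEngineCache:
    """Hypothetical holder that rebuilds the cached engine only when the language changes."""

    def __init__(self):
        self.pororo_cache = None

    def get_pororo(self, lang="en"):
        # Reuse the cached engine if it was built for the same language.
        if self.pororo_cache is None or self.pororo_cache.lang != lang:
            self.pororo_cache = PororoOcr(lang=lang)
        return self.pororo_cache


cache = OcrEngineCache()
engine = cache.get_pororo("en")
print(engine.lang)
```

Caching by language keeps repeated OCR calls from rebuilding the engine, while still allowing a quick comparison between `lang` values.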
> @ogkalu2 Would it be nice to add the choice of GPT to the OCR settings?

Might do this, yeah.
This is probably a life hack =)
I usually translate comics from English, and there were always a lot of mistakes when digitizing. Yesterday I translated a page from French and forgot to switch back. Today I started digitizing from English and everything went perfectly, without a single mistake.
I checked, and the result is consistent: if you select French when digitizing English text, the result is much better!
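To put a number on "much better", one option is to compare each setting's OCR output against a hand-typed transcript using character error rate (edit distance divided by transcript length). The sketch below is standard-library Python only; the function names are hypothetical, and the sample strings are made-up placeholders, not real OCR output from any engine.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two strings (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete from a
                            curr[j - 1] + 1,      # insert into a
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def char_error_rate(expected: str, recognized: str) -> float:
    """Edit distance normalized by the length of the expected text."""
    if not expected:
        return 0.0 if not recognized else 1.0
    return edit_distance(expected, recognized) / len(expected)


def rank_settings(expected: str, outputs: dict) -> list:
    """Return (error_rate, setting_name) pairs, lowest error first."""
    return sorted((char_error_rate(expected, text), name)
                  for name, text in outputs.items())


# Placeholder strings for illustration only.
transcript = "HELLO WORLD"
outputs = {"setting_a": "HELL0 W0RLD", "setting_b": "HELLO WORLD"}
for rate, name in rank_settings(transcript, outputs):
    print(f"{name}: {rate:.2f}")
```

Running the same page through each language setting and feeding the recognized text into `rank_settings` would show whether the French setting really wins across several pages and not just one.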