itsdapolice closed this issue 3 weeks ago
Please provide a failure example and the arguments you used.
Hi,
This is the command I'm using:
python -m manga_translator -l ENG --translator=offline --font-size-minimum=15 --font-path _fontpath_/KOMIKASL.ttf --mask-dilation-offset=5 -f jpg --detector=ctd --inpainter=lama_mpe -i "_inputFolder_" --manga2eng
_fontpath_ and _inputFolder_ stand in for real paths, of course. Below are the input and output images. You can see that the text in the lower-left panel is badly translated, because the OCR fails to recognise the text as horizontal.
Hopefully this helps.
From what I can see in the several cases I came across, the OCR seems to treat horizontal, multi-line phrases as if each line were a column of vertical text. It even tries to interpret the kanji as rotated, which is why the translation is complete rubbish.
I say this because the same doesn't happen when there's just a single line. A single horizontal line is properly detected as horizontal, and its characters are recognised correctly.
I understand the need to cater for vertically aligned text (which is the majority of it, of course), but maybe the OCR code could be improved so that text rotated more than 45 degrees (with zero degrees meaning perfectly vertical) is detected as horizontal.
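To make the 45-degree idea concrete, here is a minimal sketch of that rule. It is not code from manga-image-translator: the function name `reinterpret_orientation`, the `"v"`/`"h"` labels, and the sign convention for the corrected angle are all assumptions made for illustration.

```python
from typing import Tuple


def reinterpret_orientation(direction: str, angle_deg: float) -> Tuple[str, float]:
    """Flip the assumed text direction when the rotation exceeds 45 degrees.

    Hypothetical helper, not part of manga-image-translator.
    direction: "v" (vertical) or "h" (horizontal), as assumed by the detector.
    angle_deg: rotation measured relative to that assumed direction.
    """
    if direction not in ("v", "h"):
        raise ValueError("direction must be 'v' or 'h'")

    # A small rotation is plausible for the assumed direction: keep it as-is.
    if abs(angle_deg) <= 45.0:
        return direction, angle_deg

    # A large rotation is more simply explained by the other direction
    # with a small rotation, so flip and shift the angle by 90 degrees.
    flipped = "h" if direction == "v" else "v"
    corrected = angle_deg - 90.0 if angle_deg > 0 else angle_deg + 90.0
    return flipped, round(corrected, 2)


# A box read as vertical text rotated 74.26 degrees becomes horizontal text.
print(reinterpret_orientation("v", 74.26))
# A box already read as horizontal with a small rotation is left unchanged.
print(reinterpret_orientation("h", -9.93))
```

Under this rule, a box assumed to be vertical but rotated far past 45 degrees would be handed to the OCR as slightly rotated horizontal text instead.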
Maybe this can help. I've run it in demo mode to obtain the intermediates. You can see that the bbox in question (5) is wrongly detected as rotated 74.26 degrees (since it was detected as heavily rotated vertical text), while another bbox with the same orientation (3) is correctly detected as rotated only -9.93 degrees (i.e. properly detected as horizontal text with a small rotation).
@dmMaze I think this may be more in your area? I think the code for the Comic Text Detector is yours, isn't it?
Checking the code, the issue doesn't seem to be in the OCR step but in the text box detection. The text box is given the wrong orientation, so the text is passed to the OCR as vertical.
@itsdapolice It's some text-detection postprocessing or OCR preprocessing that I didn't write. BallonsTranslator gets it right:
I'll take some time to investigate why it's wrong in manga-image-translator.
Thanks, mate. I did try to wrap my head around the code, but I'm way too unfamiliar with it to be of any help.
I've tested the new commit and it works like a charm. Thanks, @dmMaze.
Issue
It's pretty common nowadays to have Japanese text in horizontal orientation mixed with other text in the usual vertical orientation. The most common occurrence is a phone displaying text messages.
Currently, the OCR (whatever it might be, be it ctd or manga_ocr) ends up failing to correctly detect the horizontal text (maybe "failing" is too strong a word: it detects the characters, but it doesn't detect them as a sentence). The detected horizontal text ends up being translated into gibberish, degrading the whole experience.
I'm not sure how easy it would be to fix this. Maybe add a two-pass OCR detection, with one pass detecting normal vertical text and a second pass detecting horizontal text?
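As a rough illustration of the two-pass idea, the sketch below runs a recognizer once per orientation and keeps the more confident reading. Everything here is hypothetical: `recognize_best_orientation`, the `recognize(region, direction)` callable, and the assumption that it returns a `(text, confidence)` pair are not the project's API, and choosing by confidence is only one possible way to combine the two passes.

```python
from typing import Any, Callable, Tuple

# Hypothetical recognizer signature: (region, direction) -> (text, confidence).
Recognizer = Callable[[Any, str], Tuple[str, float]]


def recognize_best_orientation(region: Any, recognize: Recognizer) -> Tuple[str, str, float]:
    """Run one OCR pass per orientation and keep the more confident result."""
    candidates = []
    for direction in ("v", "h"):  # pass 1: vertical, pass 2: horizontal
        text, confidence = recognize(region, direction)
        candidates.append((direction, text, confidence))
    # On a tie, max() keeps the first candidate, i.e. the vertical reading.
    return max(candidates, key=lambda candidate: candidate[2])


def fake_recognize(region: Any, direction: str) -> Tuple[str, float]:
    """Stand-in recognizer with made-up outputs, used only to run the sketch."""
    results = {"v": ("garbled", 0.31), "h": ("readable sentence", 0.92)}
    return results[direction]


print(recognize_best_orientation(None, fake_recognize))
```

The trade-off is that every region is recognized twice, so this would roughly double the OCR cost unless the second pass is limited to ambiguous boxes.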