zyddnys / manga-image-translator

Translate manga/image (one-click translation of text in all kinds of images) https://cotrans.touhou.ai/
GNU General Public License v3.0
5.31k stars, 547 forks

[Bug]: Textbox detection seems to be performing poorly with ctd #710

Open torgabor opened 1 month ago

torgabor commented 1 month ago

Issue

Hi! I've tried this program, but textbox detection using ctd seems very unreliable: it only picks up roughly 10-50% of the text boxes.

The strange thing is that the hosted version (https://cotrans.touhou.ai/) seems much better, and BallonsTranslator (https://github.com/dmMaze/BallonsTranslator), which is supposedly based on this project, also works well.

In all cases, I'm using the ctd detector with detection size 1024, and mit-48px for OCR (where I can specify it).

Does anyone have an idea what the difference between these approaches could be? I tried this with commit 37bb4cdcf6d31a447af8ede4429201e796290af7.

Command Line Arguments

This is the one I used last, but I tried many combinations of arguments, with no success:

python -m manga_translator --translator=none --force-horizontal -l=ENG --detector=ctd --detection-size=1024 --ocr=48px --overwrite --use-gpu --save-text --inpainter none -i somepage.jpg

No response

Console logs

No response

grassmedico commented 1 month ago

Try upscaling the image first. You can also have it scaled back down afterwards if you don't want the resolution to become too big.

zyddnys commented 1 month ago

Use a higher detection resolution with the default detector, e.g. --detection-size=2048 --detector default
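
For reference, applying that suggestion to the command from the original report might look roughly like the sketch below. Only the two detector flags are swapped; every other flag and the file name somepage.jpg are copied unchanged from the report. The variable names are just for illustration, and whether this actually improves detection on a given page is untested here.

```shell
# Sketch only: the reporter's command with the detector flags replaced
# by the ones suggested above. Nothing else is changed.
DETECTOR_FLAGS="--detection-size=2048 --detector default"

CMD="python -m manga_translator --translator=none --force-horizontal -l=ENG $DETECTOR_FLAGS --ocr=48px --overwrite --use-gpu --save-text --inpainter none -i somepage.jpg"

# Print the command for review; paste it into a shell to run it.
echo "$CMD"
```

Keeping the detector flags in their own variable makes it easy to switch back to the original ctd settings and compare the saved text output of both runs on the same page.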