zyddnys / manga-image-translator

Translate manga/image: one-click translation of text in all kinds of images https://cotrans.touhou.ai/
GNU General Public License v3.0
5.32k stars 548 forks

[Feature Request]: Improve quality of translation by translating whole chapter's texts in a bulk (as a possible option) #530

Open Grachy opened 11 months ago

Grachy commented 11 months ago

What would your feature do?

Translating all texts of each chapter in a bulk:

This would still improve translation quality A LOT. Since all the text from ONE PAGE is already translated in bulk right now, we need an additional option to translate all the text from a chapter in bulk, the same way we do now with one page.
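To make the request concrete, here is a minimal sketch of what chapter-level bulk translation could look like. It assumes a hypothetical `translate_batch` callable that takes a list of strings and returns one translation per string in the same order; this is an illustration, not the project's actual translator interface.

```python
from typing import Callable, List


def translate_chapter(
    pages: List[List[str]],
    translate_batch: Callable[[List[str]], List[str]],  # hypothetical interface
) -> List[List[str]]:
    """Translate every bubble of a chapter in one call, then regroup by page."""
    # Flatten all pages into one list so the translator sees the whole chapter.
    flat = [text for page in pages for text in page]
    translated = translate_batch(flat)
    if len(translated) != len(flat):
        raise ValueError("translator returned a different number of texts")

    # Slice the flat result back into the original per-page grouping.
    result: List[List[str]] = []
    pos = 0
    for page in pages:
        result.append(translated[pos:pos + len(page)])
        pos += len(page)
    return result
```

The per-page structure is recovered purely from the bubble counts, so it only works if the translator keeps the order and number of texts.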

zyddnys commented 11 months ago

we already use all texts in a page in a single translation api call

Grachy commented 11 months ago

we already use all texts in a page in a single translation api call

My bad, then it is weird that it translates so oddly, as if each bubble were a different entity altogether. I assume there is an issue somewhere.

Either some translation AIs comprehend bubbles as separate entities, or it is something else, because gpt4 gets it perfectly. Maybe we separate each bubble in such a way?

zyddnys commented 11 months ago

which api are you using?

Grachy commented 11 months ago

which api are you using?

I used google, gpt3.5, gpt4 and deepl

BigEmperor26 commented 11 months ago

I get what you mean. You can technically do something like this, particularly on offline models, and it does indeed improve translation quality.

However, if you merge all text bubbles together, you get only one output translation. There is no trivial way to re-split that single output into the original text bubbles, placing each part of the text back into the bubble it came from.

You could add a special token or character as a delimiter that is not used in any language and therefore will not be translated, to detect and keep track of the parts that make up each text bubble.
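A rough sketch of the delimiter idea, assuming a hypothetical `translate` callable (string in, string out) and assuming the engine passes the `|||` delimiter through unchanged, which real engines do not guarantee:

```python
from typing import Callable, List

# Hypothetical delimiter; assumed to survive translation untouched.
DELIM = "|||"


def translate_merged(bubbles: List[str], translate: Callable[[str], str]) -> List[str]:
    """Join bubbles with a delimiter, translate once, and split the result back."""
    merged = f" {DELIM} ".join(bubbles)
    translated = translate(merged)
    parts = [part.strip() for part in translated.split(DELIM)]
    # If the engine dropped or duplicated a delimiter, the mapping is lost.
    if len(parts) != len(bubbles):
        raise ValueError(
            f"expected {len(bubbles)} parts after splitting, got {len(parts)}"
        )
    return parts
```

With a stand-in translator such as `str.upper`, `translate_merged(["hello", "how are you"], str.upper)` gives back two parts, one per bubble.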

However, for some languages the translation will change the position of parts of the text, so text can end up in the wrong bubbles.
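One partial mitigation, sketched under assumptions: number the markers so that whole segments can be put back in the right bubble even if the engine reorders them, and fall back to per-bubble translation when markers come back missing or duplicated. The `translate` callable and the `<n>` marker format are hypothetical, and this does not help when individual words migrate across a marker.

```python
import re
from typing import Callable, Dict, List

# Hypothetical numbered marker, e.g. "<0>", "<1>"; assumed to survive translation.
MARKER = re.compile(r"<(\d+)>")


def translate_numbered(bubbles: List[str], translate: Callable[[str], str]) -> List[str]:
    """Translate all bubbles in one call, mapping segments back by marker number."""
    merged = " ".join(f"<{i}> {text}" for i, text in enumerate(bubbles))
    translated = translate(merged)

    # With one capture group, split() yields [prefix, idx, text, idx, text, ...].
    pieces = MARKER.split(translated)
    indices = pieces[1::2]
    found: Dict[int, str] = {
        int(idx): text.strip() for idx, text in zip(indices, pieces[2::2])
    }

    # Every marker must appear exactly once; otherwise translate bubble by bubble.
    intact = len(indices) == len(bubbles) and sorted(found) == list(range(len(bubbles)))
    if not intact:
        return [translate(text) for text in bubbles]
    return [found[i] for i in range(len(bubbles))]
```

The fallback trades the extra context for a guaranteed one-to-one mapping, which seems like the safer default when the markers cannot be trusted.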

It might be interesting for languages that share the Subject Verb Object order. https://en.m.wikipedia.org/wiki/Subject%E2%80%93verb%E2%80%93object_word_order#:~:text=In%20linguistic%20typology%2C%20subject%E2%80%93verb,second%2C%20and%20the%20object%20third.

Nice suggestion overall

Grachy commented 10 months ago

This would still improve translation quality A LOT. Since all the text from ONE PAGE is already translated in bulk, we need another option to translate all the text from a chapter in bulk, the same way we do now with one page.

Grachy commented 10 months ago

we already use all texts in a page in a single translation api call

updated issue