Closed notimp closed 2 weeks ago
Here are the unedited images for you to test on. The source language is Dutch (Netherlands); the target language I used was German.
In the config panel, under Typesetting, uncheck Autolayout and no extra line breaks will be inserted into the translated text. Besides, you can write regex in Titlebar->Edit->Keyword substitution for OCR & translated text.
Thank you very much. I'll close the ticket once I've had the chance to confirm it. :)
Thank you!
Unchecking that checkbox solved all my problems. Thank you. I'm closing this ticket now.
Commented in the wrong issue, so I'll post it here:
The automatic mode should perform better now; the original algorithm was mostly meant for manga. You may want to try it with that checkbox checked and the font size set to use the global setting. Tuning the line spacing may also affect the results.
Google Lens OCR is one of the best free OCR options out there, and Google Translate is one of the most popular free translators among users here. But if you use them (even with the "handling newline" setting in the Google Lens module set to remove), there will still be line breaks in every recognized text bubble. Those are fine if you don't plan on doing a manual quality assurance pass (editing all text in the comic), but if you plan on reflowing the text afterwards, removing them by hand becomes the most time-intensive step of that pass. (Using a different font size or changing the size of the rectangle both count as reflowing the text. :) )
Please add an option for a text parser that removes every whitespace character (\s in regex terms), non-breaking spaces included, and every newline (\n) from all text in one text bubble (recognized and handed forward as "one text box" by the OCR, and translated as "one text box" by the translator module) and replaces it with exactly one space. That way, all the text we get back in the translation box for one text bubble would by default be a single unbroken line, limited only by the text box's boundaries.
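To make the request concrete, here is a minimal sketch in Python of what such a parser could look like. This is not BallonsTranslator's actual code: the function name `flatten_block_text` and the `enabled` flag are made up for illustration. It also assumes that a run of several whitespace characters should collapse to one space (replacing each character individually would leave double spaces) and that leading and trailing whitespace can be dropped.

```python
import re

# Matches any run of whitespace. For str patterns in Python 3, \s covers
# spaces, tabs, \n, \r and Unicode spaces such as the non-breaking space U+00A0.
_WHITESPACE_RUN = re.compile(r"\s+")


def flatten_block_text(text: str, enabled: bool = True) -> str:
    """Collapse every whitespace run in one text block to a single space.

    `enabled` stands in for the requested opt-in setting (hypothetical name).
    When it is False, the text is returned untouched (current behavior).
    """
    if not enabled:
        return text
    # Assumption: leading/trailing whitespace is also unwanted, so strip it.
    return _WHITESPACE_RUN.sub(" ", text).strip()


if __name__ == "__main__":
    sample = "Dit\nis een\u00a0 voorbeeld\r\nmet regeleinden."
    print(flatten_block_text(sample))  # Dit is een voorbeeld met regeleinden.
```

Applied once per text box, to the OCR result and/or the translation, this would leave the wrapping entirely to the text box boundaries.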
Here are some examples to illustrate the issue.
All images are "as Google Lens OCR, Google Translate and BallonsTranslator filled out those bubbles in auto mode, after hitting run". None of the results were modified manually.
In all the examples in this image you see the same behavior: after the first word, there is a line break in the text.
If you plan on reflowing those bubbles (changing their size or their font size), you always have to remove that line break manually, as it will still be honored after reflowing the text.
Here is a more outrageous example:
So what I'm proposing is this: give us an optional feature that parses the text of every recognized speech bubble, removes all line breaks (soft and hard), and replaces each of them with one space.
So this:
which will give the following result:
==
The goal is that the text within a text bubble is limited only by the boundaries of the text box, and not by line breaks that were already in the text.
Make it available as an optional feature (many people will prefer the current behavior, as it gives you better results if you don't plan on doing a quality control pass afterwards).
Having such an optional feature would make reflowing text much less time-consuming in a manual quality control pass afterwards, where you'll be touching close to every text box and resizing it anyhow.
I'm currently unsure whether Google Lens (unlikely, see the text boxes) or Google Translate adds those newline characters, or whether it's actually the way BallonsTranslator handles reflowing the text. Either way, please make it an optional feature to not have those line breaks occur.
If you know whether Google Lens, Google Translate, or BallonsTranslator itself adds those newline characters, please tell us (/me).
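One rough way to narrow this down, assuming the raw strings from each stage can be inspected somehow, would be to count the break characters in the OCR output and in the translation separately. The helper below is only a sketch: `count_line_breaks` is a made-up name, and the two sample strings are hypothetical stand-ins for whatever each stage actually returns.

```python
def count_line_breaks(text: str) -> dict[str, int]:
    """Count break-like characters in one text block."""
    crlf = text.count("\r\n")
    return {
        "LF": text.count("\n") - crlf,  # lone \n, not part of a \r\n pair
        "CRLF": crlf,
        "NBSP": text.count("\u00a0"),   # non-breaking spaces
    }


if __name__ == "__main__":
    # Hypothetical strings standing in for the output of each stage.
    ocr_text = "Dit is\neen test"
    translated = "Das ist ein Test"
    for stage, text in (("ocr", ocr_text), ("translation", translated)):
        print(stage, count_line_breaks(text), repr(text))
```

If neither string contains a break but the rendered bubble still shows one, the break would have to come from the layout step rather than from the OCR or the translator.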
Any help would be appreciated. :)
Thank you,
notimp
edit: I'll also post the uncleaned test images, give me a sec.