ogkalu2 / comic-translate

Desktop app for automatically translating comics - BDs, Manga, Manhwa, Fumetti and more in a variety of formats (Image, Pdf, Epub, cbr, cbz, etc) and in multiple languages.
Apache License 2.0

Text settings #117

Closed: TaunT closed this issue 1 month ago

TaunT commented 2 months ago

1) outline color 2) line spacing

Sterben1579 commented 2 months ago

@TaunT Hi friend, how are u :)

To be honest, we really need a feature that allows us to interactively adjust the size and position of the text. If you have enough coding knowledge, could I ask you to do this? In the grills, some of the text in the bubbles needs to be large while others need to be small. I'm not getting a good result with the minimum text size feature. The text looks very unattractive and mismatched. Please bring this feature for us.

TaunT commented 2 months ago

@Sterben1579 Hi! Could you attach some screenshots of the bad text you have, so that it would be clearer how to fix it? And the original picture before translation, so that I could try it myself.

p.s. @ogkalu2 Regarding the distance between lines, it was very small = 1, I set it in the code = 2 and it became much better.

notimp commented 2 months ago

I can provide one such example. :)

It seems that if the paint in tool fails (too little white), the "biggest rectangle" tool can fail in spectacular fashion. :) In that case, a rectangle you can reposition and resize, with text that resizes and auto-formats accordingly, would be the GOAT feature. :) For bonus points, add this: while you resize the rectangle with the mouse button still held down (which reformats the text block, so hyphenation, characters per line, and therefore the number of lines change, and snaps the text to the rectangle size), scrolling the mouse wheel changes the font size. :)

[Images: 007_translated, 007]

OG language of the comic is Dutch, btw.

Btw. the fill-in settings for the text are still set to center it in the rectangle (the default), so the real problem is that there is too little white in the text bubble. Biggest rectangle then picks some small field of "closest to white" and crams all the text in there, with the text field usually auto-extending below if there is too much text, even for the small font size picked. (I've had examples where the text in such a case extends below the boundaries of the page/image and is then simply lost.) I have also had it choose the page margins, because that was the biggest amount of near-white space the rectangle tool could find near the original text bubble. All kinds of fun stuff... ;) But never, ever the dimensions of the original scramble area that the paint in tool identified correctly. :)

Unsure what's easier to fix: getting biggest rectangle to pick a spot close to what the paint in tool identified, or making the manual fix a one-stop affair by "just resizing the rectangle". But people seem to like a "snap text to box size (reformat and resize)" feature... :)

I understand that the project lead already tried to fix this with yellow text bubbles, but my inkling is that if you try to cram more colors into the paint in tool's notion of what it should do, the false positives of what it will paint over on a page would increase as well, so use that "method" sparingly, probably. It's not so much that objects on screen will go missing; it's that the paint in tool already has a tendency to pick (non-detailed) faces of characters and try to "clean" them, so you wouldn't want to encourage any further behavior like that down the line... :) (Currently it doesn't happen often, but it does once in a while, and it sticks out as "bad performance" much more than misaligned text, or text that's too small, in my opinion. Long story short: try to keep that behavior at a minimum.)

For my purposes it's fine as is, as I'm not seeking "fit to publish" quality and my cbz reader has a magnifying glass option. But reading a translated comic without it, and having to zoom and then recenter the entire page over and over again, is not a pleasant experience. (I did it using the MacBook's screen zoom feature, which zooms the entire screen and can be considered best in class: very responsive, very "zooms where you want it to", and it can be engaged just by holding down two buttons and moving your fingers on the trackpad, so it's designed very well. And still, it's tiring.) A comic book reader with a magnifying glass option works very well as a workaround. (It should support press-key-to-toggle, because having a magnifying glass on screen all the time also isn't a fun experience... :) )

As soon as the comic's pages get a little brighter, a little more "white looking", text sizing and placement get better instantly. :) So it seems to be mainly caused by the biggest rectangle tool not choosing the spot that paint in chose, because paint in failed to provide a bubble with a decent amount of white space.
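To make the failure mode above concrete, here is a minimal sketch of a "largest rectangle of near-white pixels" search. It is an assumption about how such a step could work (the classic histogram/stack method on a boolean mask), not a copy of what comic-translate actually does; the function name `largest_rectangle` and the toy mask are made up for illustration.

```python
from typing import List, Sequence, Tuple


def largest_rectangle(mask: Sequence[Sequence[int]]) -> Tuple[int, int, int, int]:
    """Return (x, y, width, height) of the largest all-truthy rectangle in mask.

    mask[y][x] is truthy where the pixel counts as "near white".
    """
    if not mask or not mask[0]:
        return (0, 0, 0, 0)
    cols = len(mask[0])
    heights: List[int] = [0] * cols  # run of truthy cells ending at the current row
    best = (0, 0, 0, 0)
    best_area = 0
    for y, row in enumerate(mask):
        for x in range(cols):
            heights[x] = heights[x] + 1 if row[x] else 0
        stack: List[int] = []  # column indices whose heights are strictly increasing
        for x in range(cols + 1):
            cur = heights[x] if x < cols else 0  # sentinel 0 flushes the stack
            while stack and heights[stack[-1]] >= cur:
                h = heights[stack.pop()]
                left = stack[-1] + 1 if stack else 0
                w = x - left
                if h * w > best_area:
                    best_area = h * w
                    best = (left, y - h + 1, w, h)
            stack.append(x)
    return best


# Toy "bubble": 1 = near white, 0 = anything else.
bubble = [
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [1, 1, 1, 1, 1],
    [0, 0, 1, 0, 0],
]
print(largest_rectangle(bubble))
```

If the bubble itself has only a few near-white cells (a yellowed page, for instance) while a page margin has many, a search like this will happily return the margin, which matches the behavior described above.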

The opposite, so that text can be far too big, can be an issue as well, but in my experience its much less common, and almost always caused by biggest rectangle identifying space thats outside a bubble as fair game - as well.

Here is an example of that occurring as well:

[Images: 0064_translated, 0064]

(If clicking on the images gives you 404 errors and you have problems obtaining the original uploads, you can drag them from the post to your desktop, and they should save. GitHub is Microsoft now, right? wonders... ;) )

notimp commented 2 months ago

Actually... UX wise, make that an intuitive two step process probably.. :)

So the user selects a rectangle (allow them to select multiple rectangles at once, too), and then scrolling the mouse wheel (or scrolling with Ctrl held down as a modifier, so it doesn't happen by accident) changes the font size in it (or in those), with the rectangle extending around its center point accordingly. (This might be harder than it seems at first glance, btw, because you might run into "different fonts have different line spacing" and "font size means something very different for every font out there (12 isn't 12, isn't 12...)" issues. So a check of the upper boundary of the first line, the right boundary of every line, and the lower boundary of the last line, to determine how much the rectangle should grow or shrink, might be needed to achieve this cleanly.)

Then, as a second step, allow users to freely reposition and resize rectangles, where resizing a rectangle automatically changes characters per line, number of lines, and hyphenation according to the size of the rectangle, while it's being resized. (The hyphenation rules are different for every language, but just code for English and ignore people's complaints... good enough... ;) ) You can keep the "auto-extend text below the rectangle" behavior as a fail-safe/fallback if the user wants to do something with the rectangle that the amount of text in it doesn't allow for.
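A rough sketch of what "snap text to box size" could look like: try font sizes from large to small, re-wrap the text for each, and take the first size whose lines fit the rectangle. Everything here is an assumption for illustration: `fit_text` is a made-up name, and the fixed `char_ratio`/`line_ratio` stand in for real per-font metrics (which, as noted above, differ for every font). Hyphenation is not handled; `textwrap` simply breaks over-long words.

```python
import textwrap
from typing import List, Tuple


def fit_text(
    text: str,
    box_w: float,
    box_h: float,
    min_size: int = 6,
    max_size: int = 40,
    char_ratio: float = 0.6,   # assumed average glyph width / font size
    line_ratio: float = 1.2,   # assumed line height / font size
) -> Tuple[int, List[str]]:
    """Return (font_size, wrapped_lines) for the largest size that fits the box."""
    for size in range(max_size, min_size - 1, -1):
        chars_per_line = int(box_w // (size * char_ratio))
        if chars_per_line < 1:
            continue  # not even one character fits at this size
        lines = textwrap.wrap(text, width=chars_per_line)
        if len(lines) * size * line_ratio <= box_h:
            return size, lines
    # Fallback: nothing fits, so use the minimum size and let the text
    # extend below the rectangle.
    chars_per_line = max(1, int(box_w // (min_size * char_ratio)))
    return min_size, textwrap.wrap(text, width=chars_per_line)


size, lines = fit_text("The quick brown fox jumps over the lazy dog", 200, 100)
print(size, lines)
```

Calling `fit_text` again with new `box_w`/`box_h` on every resize event would give the "reformat while dragging" behavior; a scroll-wheel handler could instead clamp `max_size` and `min_size` to the same value to force a specific size.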

Moving both steps into one action would be great to achieve, but then it's probably overcomplicated for the user, and changing the font size of multiple boxes in one step (have multiple boxes selected, move the scroll wheel to change font size) is probably a good thing to add as well. So separating both steps into different actions might actually be the way to go.

Also I'm bad at math and can not code (well shell scripting and googling code snippets from other people and modifying them doesnt count.. ;) ), so have fun with my notion of "excellence". ;)

(edit: If you integrate auto-hyphenation, make it optional (checkbox). I don't know how well English hyphenation rules would work with non-Roman languages (Japanese, Korean, Mandarin, Cantonese...). :) )

notimp commented 2 months ago

Also, "minimum font size" as the currently chosen compensation feature will, as far as I understand it, confuse many users... Because the user never sees what font size is picked, and it is also decided solely by the size of the initial image and therefore changes all the time. Inventing another relative measure based on vertical and horizontal (two-page layout) "standard comic page" sizes (so giving the user a "virtual font size range" that "kind of always stays the same (across comics)") might allow them to make changes once and then have them auto-apply to all comics in a similar fashion. Because as far as I understand it, currently they do not...

But then this also gets complexity up in a stupid way for close to nothing - so... there probably is still a better way to tackle this I'm not thinking about...

I'm just indicating that "minimum font size: 12" (f.e.) is an issue currently, because it can mean an entirely different thing for every comic, based on original image size (dimensions). [Unsure, please correct if not true.] Just as a heads up.
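If that assumption holds, the "virtual font size" idea could be as small as this. It is purely illustrative: `font_px` is a made-up helper, and the 2000 px reference page height is an arbitrary number picked for the example, not something taken from the program.

```python
def font_px(virtual_size: float, image_height_px: int, reference_height_px: int = 2000) -> int:
    """Scale a user-facing "virtual" font size to pixels for a given page height.

    reference_height_px is an assumed "standard comic page" height; a virtual
    size of 12 means 12 px on a page of exactly that height.
    """
    return max(1, round(virtual_size * image_height_px / reference_height_px))


# The same setting on three scans of different resolution.
for height in (1000, 2000, 4000):
    print(height, font_px(12, height))
```

With something like this, "minimum font size: 12" would cover the same share of the page on a small scan and on a large one, so a setting made once would carry over between comics.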

TaunT commented 2 months ago

I take it you are using the automatic mode? Unfortunately, it is not possible to adequately make beautiful text in it. In manual mode you can add hyphens to long words, change the size of the rectangle, and remove the old text as much as possible in several steps. Yes, it is work) one page in manual mode takes me 7-15 minutes, but there is no other way to achieve a good result. Basically, time is spent editing texts after the GPT to make the translation more literary, changing block boundaries, adjusting the text - you have to insert extra spaces, hyphens, and sometimes periods between words - to move a word away from the center.

@notimp Here is an example of a manual translation of your page. I also slightly adjusted the color, contrast and rotation in a simple image viewer.

[Image: 444]

Here, for example, I inserted spaces and periods to move the bottom lines away from the center of the entire text so that they do not touch the block border, because the original block is smaller at the bottom. I can easily erase them later in a graphic editor.

image

Sterben1579 commented 2 months ago

@TaunT As you can see in the image, the text position and size are very disproportionate. They often extend outside the speech bubble or are not centered, resulting in an unsightly appearance. I wish there was an additional section before the 'paste translations' button where we could manually adjust the text position and size as we like. This way, instead of manually adjusting each one in manual mode, we could switch to automatic mode, then adjust and save the final positions and sizes.

Today, I translated a webtoon, and it took me about 30-40 minutes. I wonder if what I'm asking for is too difficult to implement?

ab

TaunT commented 2 months ago

@Sterben1579 There was a humorous monologue by one artist, which, I think, will be understandable to a resident of any country - do you need it well or quickly? ) I understand the temptation, I also tried to translate entire chapters of comics of 20 pages in automatic mode, but, as it seems to me, it is unrealistic to do it well, no matter how hard you try.

I did not quite understand about switching, do you want the program to pause after the translation stage on each page to correct the blocks, and then start again? How will this differ from the manual mode, where you just press 3-4 buttons manually before doing this? In addition, correcting the blocks once is not enough, I usually try to change the size and position of the blocks from 3 to 10 times on one page.

In addition, at earlier stages, you sometimes have to delete unnecessary automatically created blocks or add those that were not created automatically, move the boundaries of overlapping blocks, erase old markers at the cleaning stage or draw where the text is not painted over automatically. In general, it seems to me that it will not be possible to make a 100% ideal genius automatic mode. The option to first run automatically and then manually only some pages did not suit me, because out of 20 pages, 18-19 had to be completely re-done and this is a waste of money using translation API.

But I understand that in manga and similar black and white comics, it can be easier... But not ideal. Specifically on your page above, have you tried to reduce the maximum font size? Can you give me the original manga picture, I will try to select the parameters so that the first automatic time it looks not so bad?

TaunT commented 2 months ago

@Sterben1579 Also, I remembered this comment https://github.com/ogkalu2/comic-translate/issues/94#issuecomment-2269806837 , you can enable hyphenation of long words, but I don't know how good it is, I haven't tried it myself. It is clear that in Japanese or Chinese, hieroglyphs take up little space and in another language there will be long words. This further complicates fully automatic translation. But you can try)

notimp commented 2 months ago

@TaunT No problem, I'm fine with the results as is. :) And yes, I was using automatic mode.

Also, I strongly expect that simply brightening the images "fixed" the auto rectangle placement in those images, as again, in my experience, as soon as the images get a little brighter (less yellow) I get fewer of the "misaligned and small text" issues. :) (Also, don't try to compensate for that in the algo; I'm worried that it might lead to more false positives on faces that get cleaned down the road.) :)

I was just providing example images. :)

Sterben1579 commented 2 months ago

@TaunT @ogkalu2


Sorry for the inconvenience, but even though I understand what you wrote, I need to use translation to communicate. As a result, my sentences might lose some meaning.

Let me explain the idea I have in detail. First of all, our biggest issues are: "translation quality, text positioning, and text size," right? To quickly and easily address this issue, I have an idea. The method I'll discuss is intended to make the automatic mode more practical, but it also works well in manual mode.

What I suggest is adding a new step between the cleaning and text pasting stages. This new step would be a quality control stage. Before the translated texts are applied to the image, we should see the texts on the image in separate blocks, similar to the text addition feature in Photoshop. When we click on these text blocks, we should be able to make adjustments like "changing text content, repositioning by dragging, resizing text, and altering the text area," and most importantly, seeing this live. Later, when we click the paste translation stage, the texts will be applied to the image with their position, size, and area information.

Let me explain the advantage this provides in automatic mode, step by step:

  1. We upload the images to the program.
  2. We start the automatic mode, and it begins translating.
  3. The program will pause at the quality control stage for each image and then move to the next image.
  4. The automatic mode completes the translation.
  5. We review all images at the quality control stage and make corrections for any incorrect translations and text adjustments.
  6. There should be an "apply text to image" button somewhere in the program, which we will press. The program will process the images one by one with the adjustments we made.

This way, we will achieve a perfect quality comic book chapter by dedicating only 20 minutes to a single section.

This request might have seemed difficult at first, but actually, the only critical point is creating those text blocks. If we can achieve that, the rest will come easily since the program's structure is already suitable for this purpose.

Please feel free to ask any questions you may have. Let’s develop this program together.

TaunT commented 2 months ago

@Sterben1579 "the only critical point is creating those text blocks." What's so difficult about it? Text blocks are created anyway. You need to add an option "pause before moving to a new image"; if it is selected, the program pauses and saves a flag that it checks every second, and a "Play Next" button removes this flag so the program goes on. But this development takes time anyway.
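A minimal sketch of that pause flag, assuming the batch runs in a worker thread. `PauseGate`, `run_batch`, and `process` are hypothetical names for illustration, not functions from the program; `threading.Event` plays the role of the flag, and its `wait(timeout=...)` call is the "check every second".

```python
import threading
from typing import Callable, Iterable


class PauseGate:
    """A resumable pause flag for a worker thread."""

    def __init__(self) -> None:
        self._resume = threading.Event()
        self._resume.set()  # start in the "running" state

    @property
    def is_paused(self) -> bool:
        return not self._resume.is_set()

    def pause(self) -> None:
        self._resume.clear()

    def play_next(self) -> None:
        """What a "Play Next" button would call."""
        self._resume.set()

    def wait_if_paused(self, poll_seconds: float = 1.0) -> None:
        # Wake up every poll_seconds to re-check the flag.
        while not self._resume.wait(timeout=poll_seconds):
            pass


def run_batch(
    pages: Iterable[str],
    process: Callable[[str], None],
    gate: PauseGate,
    pause_after_each: bool = True,
    poll_seconds: float = 1.0,
) -> None:
    """Process pages one by one, optionally pausing after each for manual fixes."""
    for page in pages:
        process(page)
        if pause_after_each:
            gate.pause()
            gate.wait_if_paused(poll_seconds)
```

With `pause_after_each=False` this is just the normal unattended run, so the option could be a single checkbox.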

And I still don't see the point in this mode)

You will just sit in front of the monitor and wait 3-4 minutes until the program pauses, move blocks, press next. The point of the auto mode is to turn it on and go for a walk outside, or turn it on at night and go to bed. If you sit at the computer without stopping, then there is no difference from the manual mode: upload an image, press Identify, press Recognize, press Translations, press Segment and Clear, move blocks.

@ogkalu2 mode should look different - uploaded 20 images, pressed the Start button and left for an hour. And the program remembered the state for each image before saving the result. You came back, looked at them all in turn, made corrections on each and then pressed the final button "Everything is great" and everything was saved in its entirety. But this is already more difficult to do.
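To sketch what "remembered the state for each image" might involve: each page keeps its cleaned image and its text blocks until the user signs off. The class and field names below (`TextBlock`, `PageState`, `reviewed`, and so on) are invented for illustration and are not the program's actual data model.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class TextBlock:
    x: int
    y: int
    width: int
    height: int
    source_text: str
    translation: str
    font_size: int


@dataclass
class PageState:
    image_path: str
    cleaned_image_path: str  # page after text removal, before text is rendered
    blocks: List[TextBlock] = field(default_factory=list)
    reviewed: bool = False   # set once the user has checked this page


def dump_states(states: List[PageState]) -> str:
    """Serialize all page states so a review can continue after a restart."""
    return json.dumps([asdict(s) for s in states], ensure_ascii=False, indent=2)


def load_states(payload: str) -> List[PageState]:
    states = []
    for raw in json.loads(payload):
        blocks = [TextBlock(**b) for b in raw.pop("blocks")]
        states.append(PageState(blocks=blocks, **raw))
    return states


def pending_review(states: List[PageState]) -> List[PageState]:
    """Pages the user still has to look at before the final save."""
    return [s for s in states if not s.reviewed]
```

The final "Everything is great" button would then render text onto each `cleaned_image_path` from the stored blocks, which is why nothing is flattened into the image until that point.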

p.s. Please forgive me if this is an immodest question - what country are you from, comrade? :) I ask because you were afraid that the translation of your words might be unclear.

notimp commented 2 months ago

Yes, step or "feature" should be applicable as a quality control step after an automatic pass, if possible. So bulk auto can run automatically, but then you can dial back in and make positional and text size changes, as well as changes to the text after the fact (also add a visibility feature that makes it possible to look at the initial image via a toggle while editing probably, also to the initial ocr'ed text). If you choose to tackle and implement it one day.

But what several people in here thought up in "we want" hypotheticals ( ;) ) seems very similar in concept; not much that's contradictory in here so far.. :) Just don't pause progress, but allow the quality control step later on as a last step, and make it optional.

If you tackle it at all.

The option to do a final manual pass (optional) with text edit also as an option, after an entire auto pass is probably what people would like most, because it would require from them the least amount of clicks, the least amount of waiting in front of the machine, they would already see almost all filler text in a language they understand, ...

But I understand that this would be a huge undertaking, and the program currently isnt designed with that concept in mind. And might never be.

TLDR; I agree.

TaunT commented 2 months ago

Yes, I looked, and it seems that auto mode currently does not create history entries that preserve the cleaned image and the text blocks. @ogkalu2 If these two things are added, then after a complete run over all the images, could we click on any of them, press the "Undo image" button, and correct the translation and blocks?

TaunT commented 2 months ago

@Sterben1579 @notimp @ogkalu2 I did it https://github.com/ogkalu2/comic-translate/pull/121 I wrote instructions for use there too

notimp commented 2 months ago

Yay. :) Thank you so much. Will test it tomorrow.

TaunT commented 2 months ago

@notimp I apologize, I checked it now and found that for some reason adding a rectangle doesn't always work, I'll look into it some more. But the main functionality with edits after the auto-mode can be tested.

Sterben1579 commented 2 months ago

@TaunT @notimp can u add me on discord? I would like to consult you on a few matters.

Username: Barbunay

ogkalu2 commented 2 months ago

> @notimp I apologize, I checked it now and found that for some reason adding a rectangle doesn't always work, I'll look into it some more. But the main functionality with edits after the auto-mode can be tested.

@TaunT It's because the update_blk_list function is not well suited for constant/incremental editing of blks. It's meant for an "OK, I've made all the changes I want, let's update everything" step. I've deprecated it and now just use Signals to alter the blk_list directly every time a blk is created or deleted.
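For readers following along, the difference between a one-shot rebuild and signal-driven updates can be sketched like this. The tiny `Signal` class is a stand-in written for this example (it is not a real GUI toolkit API), and `Canvas`, `block_created`, and `block_deleted` are hypothetical names; only `blk_list` comes from the comment above.

```python
from typing import Callable, List, Tuple

Rect = Tuple[int, int, int, int]  # (x, y, width, height)


class Signal:
    """Minimal observer: connected callables are invoked on emit()."""

    def __init__(self) -> None:
        self._slots: List[Callable[..., None]] = []

    def connect(self, slot: Callable[..., None]) -> None:
        self._slots.append(slot)

    def emit(self, *args) -> None:
        for slot in list(self._slots):
            slot(*args)


class Canvas:
    """Stand-in for the image viewer where rectangles are drawn and erased."""

    def __init__(self) -> None:
        self.block_created = Signal()
        self.block_deleted = Signal()

    def draw_rectangle(self, rect: Rect) -> None:
        self.block_created.emit(rect)

    def erase_rectangle(self, rect: Rect) -> None:
        self.block_deleted.emit(rect)


# The list stays in sync after every single edit; no batch rebuild is needed.
blk_list: List[Rect] = []
canvas = Canvas()
canvas.block_created.connect(blk_list.append)
canvas.block_deleted.connect(blk_list.remove)

canvas.draw_rectangle((10, 10, 80, 40))
canvas.draw_rectangle((120, 30, 60, 60))
canvas.erase_rectangle((10, 10, 80, 40))
print(blk_list)
```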

Sterben1579 commented 2 months ago

@ogkalu2 can u check the new issue.

TaunT commented 2 months ago

@Sterben1579 This is not a new problem; it has been there from the very beginning. Rectangles that you draw or delete are not saved unless you press OCR.

TaunT commented 2 months ago

@Sterben1579

> can u add me on discord? I would like to consult you on a few matters. Username: Barbunay

I sent a friend request, but I rarely go to Discord, so I won't be able to answer quickly, sorry