Offline Balloon Detection

xulihang / ImageTrans-docs

Documentation of ImageTrans, a computer-aided image/comic translation tool.

https://imagetrans.readthedocs.io/


Offline Balloon Detection #24

Closed HeLan8 closed 3 years ago

HeLan8 commented 3 years ago

Does offline balloon detection require setting up the API URL in the settings, or do I have to do something else? When I check the box, it tells me: "Model is not placed correctly." I'm not sure how to set up offline detection.

xulihang commented 3 years ago

The detection model has to be trained using Darknet or the TensorFlow Object Detection API. There is a pretrained one you can find here: https://github.com/xulihang/ObjectDetector/releases/download/models/yolo_darknet.zip. Unzip it into ImageTrans's folder and check the offline checkbox.
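A minimal sketch of the unzip step, assuming the archive has already been downloaded from the URL above. The function name `install_model` and the example paths are illustrative and not part of ImageTrans; the exact file layout ImageTrans expects inside its folder is not stated in this thread, so the archive is extracted as-is.

```python
import zipfile
from pathlib import Path


def install_model(zip_path, imagetrans_dir):
    """Extract a downloaded model archive into the ImageTrans folder.

    Returns the list of entries that were in the archive, so the caller
    can check what was placed where. Both arguments are paths chosen by
    the user; nothing here is an ImageTrans API.
    """
    target = Path(imagetrans_dir)
    if not target.is_dir():
        raise FileNotFoundError(f"ImageTrans folder not found: {target}")
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target)
        return archive.namelist()
```

For example, `install_model("yolo_darknet.zip", "C:/ImageTrans")` would unpack the pretrained model next to the program, where the second path is a hypothetical install location. If the "Model is not placed correctly" message still appears afterwards, the returned list shows which files were extracted.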

HeLan8 commented 3 years ago

Thank you, it works well. I find balloon detection to be the best way to create automatic text areas. Heuristic, for some reason, often creates several text areas within one bubble, which makes it kind of useless.

xulihang commented 3 years ago

"heuristic for some reason often creates several text areas within one bubble"

You have to adjust its params like height span and width span.

HeLan8 commented 3 years ago

You're right. I've increased the number and it works properly now. Though I do wonder why you set those numbers as the default. It never worked for anything I used, so I wonder if it's fine for other people.

xulihang commented 3 years ago

I've talked about this in the blog and docs. The method is a bit complex to explain. Maybe I should emphasize this in a suitable place.

HeLan8 commented 3 years ago

I understand now. It's a great thing that heuristic enables that much customization. I'd say the default is better when you make a custom translation on your own, but you do advertise your program in some places as an automatic manga/comic translation tool, so if someone buys the program for that purpose, I'd say having that kind of default would be counterproductive.

xulihang commented 3 years ago

That's right. I'd better make the program able to determine the params by itself.

HeLan8 commented 3 years ago

Here is what I found to be the best method to create automatic text areas with the highest accuracy. I first run heuristic text area detection with text area confidence. After that I remove low-confidence areas for all pictures, then I remove non-text areas with Azure OCR (which seems to be the best option for this operation). Removing non-text areas gets rid of text areas with no text and sometimes also of text areas that do have text in them. So after that I use the offline balloon detection to get text areas for balloons that either weren't detected by heuristic or were removed when removing non-text areas with OCR. Balloon detection almost never gives me a text area with no text, which is why I do that at the end.

For automatic detection of text areas, this seems to give me the best results, combining both heuristic and balloon detection.
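The order of operations described above can be sketched as follows. This is an illustration of the workflow, not ImageTrans code: the `Box` type, the `has_text` callable (standing in for an OCR check), and both thresholds are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned text area; confidence is only meaningful for heuristic boxes."""
    x: int
    y: int
    w: int
    h: int
    confidence: float = 1.0


def overlap_ratio(a, b):
    """Intersection area divided by the area of the smaller box."""
    ix = max(0, min(a.x + a.w, b.x + b.w) - max(a.x, b.x))
    iy = max(0, min(a.y + a.h, b.y + b.h) - max(a.y, b.y))
    smaller = min(a.w * a.h, b.w * b.h)
    return (ix * iy) / smaller if smaller else 0.0


def combine(heuristic_boxes, balloon_boxes, has_text,
            min_confidence=0.5, max_overlap=0.3):
    """Filter heuristic boxes, then fill the gaps with balloon boxes.

    min_confidence and max_overlap are hypothetical values for this sketch.
    """
    # Step 1: drop low-confidence heuristic areas.
    kept = [b for b in heuristic_boxes if b.confidence >= min_confidence]
    # Step 2: drop areas in which the OCR check finds no text.
    kept = [b for b in kept if has_text(b)]
    # Step 3: add balloons that no surviving area already covers.
    for balloon in balloon_boxes:
        if all(overlap_ratio(balloon, k) <= max_overlap for k in kept):
            kept.append(balloon)
    return kept
```

Running balloon detection last means a balloon is only added where the earlier steps left nothing, so areas that the OCR check removed by mistake get a second chance without duplicating areas that survived.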

xulihang commented 3 years ago

What is the language of your comics? Combining different methods does improve the result, but if the OCR is good enough, you can rely on it entirely.

HeLan8 commented 3 years ago

I mostly translate Japanese manga to English. When I tried removing non-text areas, no OCR worked perfectly. They all removed some text areas that have text in them, and Tesseract OCR removes all text areas. But like I said, heuristic doesn't detect everything, so even with a good OCR I would use balloon detection at the end.

xulihang commented 3 years ago

Some OCRs are good enough that they can do both detection and recognition directly. You could try Google Vision: https://cloud.google.com/vision/docs/drag-and-drop and Clova: https://clova.ai/ocr.

But they are expensive. The way you just mentioned is more affordable.

HeLan8 commented 3 years ago

Yeah, those services seem to be expensive for the amount of OCR I use. But there really is no need for me to go that far. My method gives me an accuracy of about 90-95%, and paying to get it from 95% to 98 or 99% really isn't worth it.

xulihang commented 3 years ago

You mentioned you use Azure. But the default Azure plan has a usage quota.

You can use OCR.Space or Windows 10's built-in OCR. Actually they share the same OCR engine.

HeLan8 commented 3 years ago

I just noticed: once I reached the limit, it started removing every single text area. But Azure was enough to work on more than 1000 pages. If it's a daily limit then I can just wait, but if it has more restrictions then I'll switch to OCR.Space. I use Baidu accurate for OCR on text areas though; it has the best results most of the time. It has a daily limit, but I'm not going to use offline OCR like Tesseract, since it gives really bad results compared to Baidu accurate.

Aah... Windows 10. It has a lot of issues and I never had any problems with Windows 7, so I'm still using it, but I'm sure it will be helpful for someone else.

xulihang commented 3 years ago

The default Azure in ImageTrans seems to have reached its monthly quota. You could apply for your own account if this happens.
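The failure mode described above, where every text area is removed once the quota is exhausted, comes from treating "the OCR call failed" the same as "the OCR found no text". A hedged sketch of keeping the two cases apart is shown below. The `ocr` callable and the `OcrUnavailable` exception are hypothetical stand-ins for this example; the sketch does not describe how ImageTrans handles the case internally.

```python
class OcrUnavailable(Exception):
    """Raised by the hypothetical OCR callable when the service rejects
    the request, for example because a usage quota is exhausted."""


def remove_non_text_areas(areas, ocr):
    """Keep areas where OCR finds text; never drop an area on OCR failure.

    Returns (kept, failed): `kept` holds the areas to keep, `failed` holds
    the subset that could not be checked and should be retried later.
    """
    kept = []
    failed = []
    for area in areas:
        try:
            text = ocr(area)
        except OcrUnavailable:
            # The service did not answer, so there is no evidence that
            # the area is empty. Keep it and flag it for a retry.
            kept.append(area)
            failed.append(area)
            continue
        if text.strip():
            kept.append(area)
    return kept, failed
```

A non-empty `failed` list is the signal to stop and switch engines or wait for the quota to reset, rather than continuing through the remaining pages.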

HeLan8 commented 3 years ago

I see. I tried to create an account, but it doesn't accept my debit card, so I can't create one. I'll just switch to OCR.Space; it worked fine as well.