HeLan8 closed 3 years ago
The detection model has to be trained using DarkNet or TensorFlow Object Detection API. There is a pretrained one you can find here: https://github.com/xulihang/ObjectDetector/releases/download/models/yolo_darknet.zip. Unzip it to ImageTrans's folder and check the offline checkbox.
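A minimal sketch of the unzip step, in case it helps to script it. The ImageTrans folder path and the helper names are assumptions for illustration, and the sketch does not know which layout ImageTrans expects inside its folder; it only extracts the archive as-is and reports whether every extracted file is actually present.

```python
import zipfile
from pathlib import Path


def install_model(zip_path, imagetrans_dir):
    """Extract the model archive into the given folder.

    Returns the list of file names (not directories) found in the archive.
    """
    target = Path(imagetrans_dir)
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target)
        return [n for n in archive.namelist() if not n.endswith("/")]


def missing_files(names, imagetrans_dir):
    """Return the archive entries that are not present under the folder."""
    target = Path(imagetrans_dir)
    return [n for n in names if not (target / n).is_file()]


# Hypothetical usage; both paths are placeholders:
# names = install_model("yolo_darknet.zip", "C:/ImageTrans")
# print(missing_files(names, "C:/ImageTrans"))
```

If `missing_files` returns an empty list and the offline checkbox still complains, the files are probably in a different subfolder than the program looks in.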
Thank you, works well. I find balloon detection to be the best way to create automatic text areas; heuristic for some reason often creates several text areas within one bubble, which makes it kind of useless.
"heuristic for some reason often creates several text areas within one bubble"
You have to adjust its params like height span and width span.
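To illustrate why larger span values produce fewer, bigger text areas, here is a rough sketch. It is not ImageTrans's actual algorithm: the parameter names `width_span` and `height_span` are borrowed from this thread, and their meaning here (the maximum horizontal and vertical gap in pixels at which two boxes are still merged) is an assumption.

```python
def gap(a_start, a_end, b_start, b_end):
    """Distance between two 1-D intervals; 0 if they touch or overlap."""
    return max(0, max(a_start, b_start) - min(a_end, b_end))


def should_merge(a, b, width_span, height_span):
    """Boxes are (left, top, right, bottom). Merge when both gaps are small."""
    h_gap = gap(a[0], a[2], b[0], b[2])
    v_gap = gap(a[1], a[3], b[1], b[3])
    return h_gap <= width_span and v_gap <= height_span


def merge_boxes(boxes, width_span, height_span):
    """Repeatedly merge any pair of boxes that is close enough."""
    merged = [tuple(b) for b in boxes]
    changed = True
    while changed:
        changed = False
        for i in range(len(merged)):
            for j in range(i + 1, len(merged)):
                a, b = merged[i], merged[j]
                if should_merge(a, b, width_span, height_span):
                    # Replace the pair with its bounding rectangle.
                    merged[i] = (min(a[0], b[0]), min(a[1], b[1]),
                                 max(a[2], b[2]), max(a[3], b[3]))
                    del merged[j]
                    changed = True
                    break
            if changed:
                break
    return merged


# Two vertical text columns, 5 px apart, inside one bubble.
columns = [(0, 0, 10, 40), (15, 0, 25, 40)]
print(merge_boxes(columns, width_span=2, height_span=2))
print(merge_boxes(columns, width_span=10, height_span=2))
```

With the small span the two columns stay separate; with the larger one they become a single area, which matches the behaviour described above.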
You're right. I've increased the numbers and it works properly now. I do wonder, though, why you set those numbers as the default. They never worked for anything I used, so I wonder if they're fine for other people.
I understand now. It's great that heuristic allows that much customization. I'd say the default is better when you're doing a custom translation on your own, but you do advertise your program in some places as an automatic manga/comic translation tool, so if someone buys the program for that purpose, I'd say having that kind of default would be counterproductive.
That's right. I'd better make the program able to determine the params by itself.
Here is what I found to be the most accurate method for creating text areas automatically. I first run heuristic text area detection with text area confidence enabled. Then I remove low-confidence areas for all pictures, and after that I remove non-text areas with Azure OCR (which seems to be the best option for this operation). Removing non-text areas gets rid of text areas with no text, but sometimes also of text areas that do have text in them. So after that I use the offline balloon detection to get text areas for balloons that either weren't detected by heuristic or were removed by the OCR step. Balloon detection almost never gives me a text area with no text, which is why I do it at the end.
For automatic detection of text areas, combining heuristic and balloon detection this way seems to give me the best results.
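The steps above can be sketched roughly as follows. This is only an outline of the order of operations, not ImageTrans code: the `Area` type, the `has_text` callback (standing in for the OCR-based non-text check), the overlap rule used to decide whether a balloon is already covered, and the 0.5 confidence threshold are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Area:
    box: tuple               # (left, top, right, bottom)
    confidence: float = 1.0  # balloon-derived areas default to 1.0


def overlaps(a, b):
    """True when two (left, top, right, bottom) boxes intersect."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def combine(heuristic_areas, balloon_boxes, has_text, min_confidence=0.5):
    # Step 1: drop low-confidence heuristic areas.
    kept = [a for a in heuristic_areas if a.confidence >= min_confidence]
    # Step 2: drop areas in which the OCR check finds no text.
    kept = [a for a in kept if has_text(a)]
    # Step 3: add balloons that no surviving area already covers.
    for box in balloon_boxes:
        if not any(overlaps(box, a.box) for a in kept):
            kept.append(Area(box=box))
    return kept
```

Running balloon detection last means it only fills gaps, so a balloon whose heuristic area was wrongly removed in step 2 is recovered in step 3.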
What is the language of your comics? Combining different methods does improve the result, but if the OCR is good enough, you can rely on it entirely.
I mostly translate Japanese manga to English. When I tried removing non-text areas, none of the OCR engines worked perfectly: they all removed some text areas that have text in them, and Tesseract OCR removes all text areas. But like I said, heuristic doesn't detect everything, so even with good OCR I would use balloon detection at the end.
Some OCR services are good enough to do both detection and recognition directly. You could try Google Vision: https://cloud.google.com/vision/docs/drag-and-drop and Clova: https://clova.ai/ocr.
But they are expensive. The way you just mentioned is more affordable.
Yeah, those services seem to be expensive for the amount of OCR I use. But there really is no need for me to go that far. My method gives me an accuracy of about 90-95%, and paying to get it from 95% to 98 or 99% really isn't worth it.
You mentioned you use Azure. But the default Azure plan has a usage quota.
You can use OCR.Space or Windows 10's built-in OCR. Actually they share the same OCR engine.
I just noticed: once I reached the limit, it started removing every single text area. But Azure was enough to work on more than 1000 pages. If it's a daily limit then I can just wait, but if it has more restrictions then I'll switch to OCR.Space. I use Baidu accurate for OCR on text areas though; it has the best results most of the time. It has a daily limit, but I'm not going to use offline OCR like Tesseract, since it gives really bad results compared to Baidu accurate. As for Windows 10, it has a lot of issues and I never had any problems with Windows 7, so I'm still using that, but I'm sure it will be helpful for someone else.
The default Azure in ImageTrans seems to have reached its monthly quota. You could apply for your own account if this happens.
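For anyone scripting a similar non-text filter outside ImageTrans, a small sketch of how to avoid the "every area removed" effect when the quota runs out: treat a failed OCR request differently from an empty OCR result. The `OcrUnavailable` exception and the `recognize` callback are hypothetical names for this example, and this is not a description of how ImageTrans handles the case.

```python
class OcrUnavailable(Exception):
    """Raised by the (hypothetical) OCR callback when the service
    refuses the request, for example because the quota is used up."""


def remove_non_text(areas, recognize):
    """Keep only areas whose OCR result is non-empty.

    Returns (areas, completed). If the OCR service becomes unavailable,
    the original list is returned untouched with completed=False, so
    nothing is deleted on the basis of a failed request.
    """
    kept = []
    for area in areas:
        try:
            text = recognize(area)
        except OcrUnavailable:
            return list(areas), False
        if text.strip():
            kept.append(area)
    return kept, True
```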
I see. I tried to create an account, but it doesn't accept my debit card, so I can't. I'll just switch to OCR.Space; it worked fine as well.
Does the offline balloon detection require setting up the API URL in the settings, or do I have to do something else for it? When I check the box it tells me: Model is not placed correctly. I'm not sure how to set up the offline detection.