Closed Kamikadashi closed 2 years ago
I can't run it under Windows 10, and its interface is all in Japanese.
I can run it under Windows 7, but the results are not good. Maybe it's because the system language is Chinese.
Yes, it seems the system language is at fault here. You can fix that by running the application through https://pooi.moe/Locale-Emulator/ or something similar. However, it is very strange that it does not run on Windows 10, as the main difference between version 16 and version 15 is support for Windows 10 and higher. In any case, you don't need to run the main program at all.
You can find the two applications I mentioned in the installation folder. The first one is the folder watcher, and the second one is the clipboard watcher. ClipOCR also requires the Japanese system language, but FWatch doesn't. I translated a few menus to make it easier to understand what does what: ClipOCR:
FWatch:
Okay, I've made it work on Windows 10.
I think the folder watcher is the best way to integrate it into ImageTrans. I've made a plugin for it.
https://github.com/xulihang/ImageTrans_plugins/commit/2a5fc70b6f0a0de2f36949f5d5fd8dc1093dd53a
Unzip the plugin files to the plugins folder and set up the watch folder path and timeout in ImageTrans's preferences.
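For illustration, the folder-watcher integration described above boils down to: drop the cropped image into the watched folder, then poll for the OCR result until the configured timeout expires. The function name and the result-file-next-to-image convention below are my assumptions, not the plugin's actual code:

```python
import os
import time

def ocr_via_watch_folder(image_bytes, image_name, watch_dir, timeout=10.0, poll=0.2):
    """Drop an image into the watched folder and wait for the folder
    watcher to write a .txt result next to it (assumed convention).
    Returns the recognized text, or None on timeout."""
    img_path = os.path.join(watch_dir, image_name)
    txt_path = os.path.splitext(img_path)[0] + ".txt"
    with open(img_path, "wb") as f:
        f.write(image_bytes)
    deadline = time.time() + timeout
    while time.time() < deadline:
        if os.path.exists(txt_path):
            with open(txt_path, encoding="utf-8") as f:
                return f.read()
        time.sleep(poll)
    return None  # the watcher never produced a result in time
```

The timeout matters because the external OCR process runs asynchronously; if it stalls, the plugin must give up rather than block ImageTrans.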
I check "strip furigana" to improve the results.
I think it does not perform well on text with complex backgrounds or at low resolution. Sometimes Tesseract and Windows 10's OCR give better results.
Thank you a lot! I just tested it, and it works. It's true that it doesn't deal well with complex backgrounds, but thankfully not all manga have a lot of them. It's fast, though, and quite accurate. The accuracy can be further improved by batch-upscaling images with the nearest neighbor before importing them into ImageTrans; the process takes less than a minute or so for a whole volume. You can also use waifu2x-caffe if you have a capable GPU, but that's admittedly more time-consuming. Strangely enough, if some kanji are recognized incorrectly, moving the box a bit to the side and clicking OCR again usually fixes it.
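Nearest-neighbor upscaling just duplicates each pixel, which is why it keeps glyph edges hard instead of blurring them the way bilinear or bicubic resampling would. A toy sketch of the operation on a 2D pixel grid (a real batch job would use an image library or waifu2x, as discussed):

```python
def upscale_nearest(pixels, factor=2):
    """Nearest-neighbor upscale of a 2D grid of pixel values by an
    integer factor: every pixel becomes a factor x factor block."""
    out = []
    for row in pixels:
        # duplicate each pixel horizontally...
        wide = [v for v in row for _ in range(factor)]
        # ...then duplicate the whole row vertically
        out.extend(list(wide) for _ in range(factor))
    return out
```

Because no new intermediate values are invented, a black-on-white glyph stays strictly two-toned after upscaling, which some OCR engines handle better than anti-aliased edges.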
I encountered a problem with the furigana remover when dealing with low-quality scans, though. It appears in some particular bubbles and persists whether or not the image was upscaled, so I don't think resolution is the reason. Some examples (in short, the algorithm sometimes removes the text instead of the furigana, or overlooks the furigana entirely): https://imgur.com/a/67BGd3r
About upscaling: in the current version, the program only scales up the cropped area if its width or height is smaller than 50 px. There is an ncnn version of waifu2x (https://github.com/nihui/waifu2x-ncnn-vulkan) which is very fast. Maybe I can integrate it into the program.
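The rule just described can be stated in a few lines (the function name is hypothetical; the 50 px threshold comes from the comment above, and the 2x factor is an assumed default):

```python
def maybe_upscale(width, height, min_side=50, factor=2):
    """Upscale the cropped area only when its width or height falls
    below min_side, mirroring the conditional-upscaling rule described
    for the current ImageTrans version."""
    if width < min_side or height < min_side:
        return width * factor, height * factor
    return width, height
```

This explains why upscaling a whole page beforehand still helps: crops above the threshold are passed to the OCR engine untouched.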
About moving a box a bit to improve the result: it is true, but I don't know the reason.
About the furigana remover: if the furigana is connected to the kanji, the current method cannot remove it perfectly. Could you send a batch of the original text-area images so that I can run a test?
I've made some improvements to Japanese OCR in recent versions.
In 1.4.5, a new furigana-stripping method was added. In 1.4.6, a new line mode for Tesseract: https://github.com/xulihang/ImageTrans-docs/issues/87
Now, Tesseract should be able to outperform Yomikaku.
I haven't tested the new Tesseract mode extensively yet. However, from what I've seen, it unfortunately still doesn't outperform Yomikaku on the scans I tried, getting kanji wrong more often than its counterpart.
The new furigana-stripping method looks promising, though. Did it replace the previous one, or is it available as an option somewhere in the settings? I may be wrong, as I haven't tested anything extensively yet, but at first glance I didn't see the vanishing-text problem I encountered earlier, so that's something.
Regarding upscaling, not all models are equally suited to improving OCR; I've found only the cunet and upresnet10 models to give consistently better results. For extreme cases, ESRGAN with the 8x_NMKD-Typescale_175k model can be used with Yomikaku to bring the OCR rate even for scans of this quality to a decent level of about 98% accuracy. Still, this model can't be used with AI-based OCR services, as their results get worse instead. Besides, it's too slow and resource-heavy, so I don't think it's feasible to integrate into ImageTrans. Cunet and upresnet10 are good enough most of the time, though, and AI OCR handles them well.
But Yomikaku doesn't handle text on complex backgrounds or light text on dark backgrounds, so it would be preferable to find something better for such cases.
I just tested one bubble. There should be a better way to test this, like using a dataset, but I haven't tried.
Yomikaku:
Tesseract (line mode):
As line mode precisely cuts the text block into lines, it should have high accuracy. I found that this mode also works for Yomikaku. Maybe I will enable line mode for all offline OCR engines.
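What a line mode buys can be sketched with a projection profile: find the empty rows (or columns, for vertical text) in the binarized block, cut there, and feed each line to the engine separately so it never has to guess the layout. A toy version, assuming a 0/1 bitmap (this is the general technique, not ImageTrans's actual implementation):

```python
def split_lines(bitmap):
    """Cut a binary text block (2D list of 0/1, 1 = ink) into lines by
    scanning for completely empty rows (projection-profile
    segmentation). For vertical Japanese text, apply the same idea to
    columns instead of rows."""
    lines, start = [], None
    for y, row in enumerate(bitmap):
        if any(row):
            if start is None:
                start = y          # a new line band begins
        elif start is not None:
            lines.append(bitmap[start:y])  # empty row closes the band
            start = None
    if start is not None:
        lines.append(bitmap[start:])       # block ends mid-band
    return lines
```

Each returned band is a single line image; OCRing bands one at a time avoids the engine merging characters across lines or misordering columns.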
Commercial OCR engines like baidu_accurate can recognize all the characters:
OCR on the same bubble after applying waifu2x.
Yomikaku:
Tesseract:
Tesseract has a problem in that it may repeat a recognized character. This may be because it uses a CTC algorithm.
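That guess is plausible: a CTC greedy decoder collapses adjacent repeated labels and then drops blanks, so a genuine double character only survives if the model emits a blank frame between the two occurrences, and conversely, a stray blank emitted in the middle of one long character produces a spurious repeat. A toy decoder (treating label 0 as the blank, which is an assumption, not Tesseract's actual label map):

```python
def ctc_greedy_decode(frame_labels, blank=0):
    """Collapse a CTC best-path label sequence: merge adjacent
    repeats, then drop blanks. A blank wrongly emitted mid-character
    splits it into two runs, yielding a duplicated character."""
    out, prev = [], None
    for lab in frame_labels:
        if lab != prev and lab != blank:
            out.append(lab)
        prev = lab
    return out
```

For example, frames `[1, 1, 2]` decode to one `1`, while `[1, 0, 1, 2]` decode to two: if the blank at position 1 was a model error, the output contains a repeat the image never had.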
waifu2x is not enough for this quality, unfortunately. Yomikaku with the 8x_NMKD-Typescale_175k model:
I see. Yomikaku does have better accuracy recognizing kanji.
It consistently fails to recognize ハ, though, mistaking it for (.
Also, you may know this already, but apparently there is a way to get Google OCR to work via the Google Drive API without resorting to the Vision API, like here: https://github.com/ttop32/JMTrans It would be cool to see this implemented.
I think moving a box a bit to improve the result happens for the same reason some characters get OCRed incorrectly when the "strip furigana" or "vertical to horizontal" options are enabled: ImageTrans handles all text boxes as lossy JPEG instead of a lossless format, so the pixel representation of the symbols changes slightly after every read/write cycle.
Yes, I do use JPEG with quality 100. I am not sure whether it has much influence:
Dim out As OutputStream=File.OpenOutput(imgPath,"",False)
img.WriteToStream(out,"100","JPEG")
out.Close
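The effect can be illustrated without a real JPEG codec: any quantizing round trip changes values on the first pass, even at maximum quality, because of color-space conversion and coefficient rounding. A toy stand-in for the save/load cycle (uniform quantization with an arbitrary step; real JPEG quantizes DCT coefficients, not raw pixels):

```python
def lossy_roundtrip(pixels, step=3):
    """Toy stand-in for a lossy save/load: snap each value to the
    nearest multiple of step. The first pass shifts pixel values,
    which is enough to nudge an OCR engine toward a different answer."""
    return [step * round(v / step) for v in pixels]
```

A lossless format such as PNG would make repeated reads and writes byte-identical, which would also make the "move the box and retry" trick stop changing anything.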
Hi! Could you please consider adding support for the Japanese 読取革命 OCR? I've found that, on average, it gives results equal to or even better than those of Google OCR and others, especially when dealing with bad-quality, low-resolution scans. It's significantly better (and cheaper) than ABBYY and doesn't require an internet connection either. 読取革命 comes with a folder watcher that monitors a user-specified folder and OCRs any image as soon as it appears there, as well as a program that OCRs any image copied to the clipboard, so I think there are ways to make it work.
The free trial is available here: https://download4.www.sourcenext.com/yomikaku16/YOMIKAKU16.exe Apparently, both the folder watcher and the clipboard watcher keep working even after the trial expires, retaining all their functionality (you can run them from the installation folder without registering anything). The quality of the OCR can be further improved by converting the image to an indexed one with a palette of only two colors, so it would be fantastic if you added that as a pre-processing option. Still, it's by no means a requirement. Here are some examples: https://imgur.com/a/ZibcALR
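The two-color conversion mentioned above amounts to picking a global threshold and binarizing; Otsu's method is the usual way to pick that threshold automatically. A pure-Python sketch on a flat list of grayscale values (a real pre-processing step would run this per image before handing it to the folder watcher):

```python
def otsu_threshold(gray):
    """gray: flat list of 0-255 values. Return the threshold that
    maximizes between-class variance (Otsu's method), separating
    ink pixels from paper pixels."""
    hist = [0] * 256
    for v in gray:
        hist[v] += 1
    total = len(gray)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var, w_b, sum_b = 0, -1.0, 0, 0.0
    for t in range(256):
        w_b += hist[t]                 # background weight
        if w_b == 0:
            continue
        w_f = total - w_b              # foreground weight
        if w_f == 0:
            break
        sum_b += t * hist[t]
        m_b = sum_b / w_b              # background mean
        m_f = (sum_all - sum_b) / w_f  # foreground mean
        var = w_b * w_f * (m_b - m_f) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t

def binarize(gray, t):
    """Map everything at or below the threshold to ink (0), the rest
    to paper (255) - i.e. a two-color indexed image."""
    return [0 if v <= t else 255 for v in gray]
```

On clean black-on-white scans this reproduces the manual two-color trick; on pages with screentone or gradients, a single global threshold can destroy text, which is why it should stay optional.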