cloudy-sfu / GUI-for-tesseract-OCR

The GUI for Tesseract OCR software in Windows 64-bit platform
GNU General Public License v3.0

GUI Tesseract OCR Crashes. #2

Open AziRizvi opened 1 year ago

AziRizvi commented 1 year ago

If I try to OCR images like this one: 0_00_16_916__0_00_20_486_3000000000640010606400120

the GUI crashes. I have the Japanese and Javanese trained language models in the models folder, but the crash persists even if I remove them.

For simple images like these: 0_00_22_622__0_00_27_359_2000000500640005606400120

It works absolutely fine. Just thought I should report this little bug.

cloudy-sfu commented 1 year ago

Thanks for your feedback!

However, I cannot reproduce this issue. I'm on Windows 10 with the latest release of this program. Could you provide more details about your platform and the Tesseract version?

Instructions: Help -> Version and supported language, then paste the information from the message box here.

I tried with and without the jpn and jav models, in both clipboard and image mode, but could not find any issue.

AziRizvi commented 1 year ago

I'm also on Windows 10 and using the latest version of the program. I'm attaching a video that shows the problem. Please watch it to the end.

https://github.com/cloudy-sfu/GUI-for-tesseract-OCR/assets/129892077/7a12b337-6aa2-455e-8cdf-a32cc8162392

If an image only contains simple letters, it OCRs just fine. But whenever an image contains some sort of "special" characters, like the one I attached above, the GUI crashes. If the "special characters" image is the first one in the folder, it crashes immediately. If it's somewhere in the middle, the GUI OCRs all of the images before it just fine and then crashes as soon as it reaches the image with special characters.

I hope my video and what I've said is explanatory enough? Sorry if I'm not being properly expressive.

Here's the version and language section that you asked me to paste.

Tesseract version: tesseract 5.3.0
leptonica-1.82.0 (Jan 18 2023, 18:49:14) [MSC v.1934 LIB Release x64]
libgif 5.2.1 : libjpeg 6b (libjpeg-turbo 2.1.4) : libpng 1.6.39 : libtiff 4.5.0 : zlib 1.2.13 : libwebp 1.2.4 : libopenjp2 2.5.0
Pretrained language models: R:\OCR Software\gui-tesseract-v0.1.3-win64\raw\models
Supported language: ['eng', 'jav', 'jpn', 'jpn_vert']

cloudy-sfu commented 1 year ago

Aha, very interesting!

I use the same version. I will take some time to inspect what causes the problem. Thank you again for your feedback.

Also, I noticed that when the program crashes, you lose the progress for the whole folder. I understand that's annoying, so I'll add more try-catch clauses to make sure you can still finish most of your work.
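To illustrate the try-catch idea, here is a minimal Python sketch. It is not this program's actual code: `ocr_folder` and `fake_recognize` are hypothetical names, and `fake_recognize` only stands in for whatever function really runs Tesseract on one image. The point is that a failure on one image is recorded and the loop moves on, so the results for the other images are kept.

```python
from pathlib import Path
from typing import Callable, Dict, List, Tuple


def ocr_folder(
    image_paths: List[Path],
    recognize: Callable[[Path], str],
) -> Tuple[Dict[Path, str], Dict[Path, str]]:
    """Run `recognize` on every image and collect failures instead of aborting."""
    results: Dict[Path, str] = {}
    failures: Dict[Path, str] = {}
    for path in image_paths:
        try:
            results[path] = recognize(path)
        except Exception as exc:
            # Record the problem for this one image and continue with the rest.
            failures[path] = f"{type(exc).__name__}: {exc}"
    return results, failures


def fake_recognize(path: Path) -> str:
    """Hypothetical stand-in for the real OCR call (assumption, not the app's API)."""
    if "special" in path.name:
        raise ValueError("cannot decode OCR output")
    return f"text from {path.name}"


paths = [Path("a.png"), Path("special.png"), Path("b.png")]
results, failures = ocr_folder(paths, fake_recognize)
```

In this sketch, `results` holds the text for `a.png` and `b.png`, while `failures` holds the error message for `special.png`. Note that a `try`/`except` only catches Python exceptions. If the crash happens inside native code, this pattern alone would not help.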

AziRizvi commented 1 year ago

If I try to OCR the weird image alone, it works and the GUI does not crash. It only seems to crash when the images are in a folder, I guess? I don't know.

explorer_ovoGiPYbpW

Here I tried to OCR the image alone and it worked.

The images I'm working with are all 8-bit grayscale images. Should I share the images?

I should share these specific images so you can check them, play around, and figure out exactly what's causing the issue. I've uploaded ALL of the images to my repository. You can download them, and hopefully that helps you diagnose the problem more easily.

Link to repository: https://github.com/AziRizvi/Dummy-Repository

cloudy-sfu commented 1 year ago

Thank you for sharing some test cases. Yes, I noticed that too. It may be because some steps use different methods in single and batch recognition, and one of them crashes. Don't worry, I'll look into it.

Please allow me some time to finish it. I'm writing my degree dissertation now, and may release a new version in a few weeks. :-)

AziRizvi commented 1 year ago

Good luck with your dissertation!!

cloudy-sfu commented 10 months ago

Fixed. It now throws an error in this case instead of crashing.
