Closed waraiotoko2501h closed 3 years ago
Try adding the following entry to ocr_labels.json:
"1800f800f801f001e0000f001ff03ff81cfc007f803fe01ffc07fe01fe001c00": "ジ"
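For anyone who prefers to script this rather than edit the file by hand, here is a minimal sketch. It assumes ocr_labels.json is a flat, UTF-8 encoded JSON object mapping hash strings to labels (which is what the entries quoted in this thread suggest); the `add_label` helper is hypothetical and not part of auto_derby, and the real file may use different formatting.

```python
import json
from pathlib import Path


def add_label(path: str, image_hash: str, label: str) -> None:
    """Add or overwrite one hash -> label entry in a flat JSON object file."""
    p = Path(path)
    # Start from the existing labels if the file is already there.
    labels = json.loads(p.read_text(encoding="utf-8")) if p.exists() else {}
    labels[image_hash] = label
    # ensure_ascii=False keeps kana readable instead of \uXXXX escapes.
    p.write_text(
        json.dumps(labels, ensure_ascii=False, indent=2),
        encoding="utf-8",
    )
```

Calling `add_label("ocr_labels.json", image_hash, "ジ")` with a hash copied from the debug log would register one entry and leave the others untouched.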
Thanks. It worked. But immediately after, I was asked for input. I'm frequently asked to enter numbers. // The "launcher.cmd" will not prompt you for input.
This project uses hash-based OCR, so GPU settings should not be related.
For some unknown reason (maybe the ClearType setting?), your text is rendered with different pixel values from my samples. Manually labeling them should work.
There seems to be a launcher bug that does not allow standard input; you can use the command line directly for now.
And once you have finished labeling the text, please send a PR with the updated ocr_labels.json so others don't have to do it again.
I see. I understand.
Is there a best way to enter the hash manually? How did you come up with this hash?
"1800f800f801f001e0000f001ff03ff81cfc007f803fe01ffc07fe01fe001c00": "ジ"
Is there a best way to enter the hash manually?
If you run
py -3.8 -m auto_derby nurturing
directly from an admin command prompt, you can enter text in the terminal.
How did you come up with this hash?
You can find it in the debug log. The current similarity threshold for OCR is 0.8, but シ has a similarity of 0.93 with ジ.
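To make the threshold concrete, here is a minimal sketch of how two hex hashes could be compared. It assumes similarity is the fraction of matching bits (one minus the normalized Hamming distance); that is an assumption for illustration, and auto_derby's actual comparison may differ. `hash_similarity` and `match_label` are hypothetical names.

```python
from typing import Dict, Optional


def hash_similarity(a: str, b: str) -> float:
    """Fraction of identical bits between two equal-length hex hashes."""
    if len(a) != len(b):
        raise ValueError("hashes must have the same length")
    total_bits = len(a) * 4  # each hex digit encodes 4 bits
    differing = bin(int(a, 16) ^ int(b, 16)).count("1")
    return 1 - differing / total_bits


def match_label(
    image_hash: str, labels: Dict[str, str], threshold: float = 0.8
) -> Optional[str]:
    """Return the label of the most similar known hash, or None below threshold."""
    best_label, best_score = None, 0.0
    for known_hash, label in labels.items():
        score = hash_similarity(image_hash, known_hash)
        if score > best_score:
            best_label, best_score = label, score
    return best_label if best_score >= threshold else None
```

Under this model, a glyph that scores 0.93 against a known hash clears the 0.8 threshold and takes that hash's label unless a closer hash is registered, which is why adding an exact entry changes the result.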
You can also enter input through the launcher now. The text prompt somehow shows up after the input; it should be shown before the input.
I've been adding the hashes manually. However, it is still very difficult to distinguish between "ジュニアクラス(junior class)" and "シニアクラス(senior class)".
It seems that the hash of "ジ" registered for the junior class is also recognized as "ジ" for the senior class.
This may not be a good solution, but I solved it as follows. https://github.com/NateScarlet/auto-derby/blob/451e2122cb8e252c9f4f646d4e7f866576a1c338/auto_derby/single_mode/context.py#L38
year = {
    'ジュニア級': 1,
    'シュニア級': 1,  # OCR misread of ジュニア級
    'クラシック級': 2,
    'シニア級': 3,
    'ジニア級': 3,  # OCR misread of シニア級
}[year_text]
This 3800...3c00 hash does not exist in my ocr_labels.json.
You should just correct the label data.
You can check the image for that hash in the ocr_images.local folder.
After several attempts, adding it to ocr_labels.json alone did not solve the problem.
The 3800...3c00 hash has similarity=1.000 for both the junior class and the senior class.
The "シ" or "ジ" problem can happen with any other hash.
For both the senior and junior classes, these hashes will have similarity=1.000. In other words, if I register the hash as "ジ" for "ジュニア級", it will also be recognized as "ジ" in "シニア級", and I will get a KeyError.
There may be others.
"1800f800f801f001e0000f001ff03ff81cfc007f803fe01ffc07fe01fe001c00": "ジ",
"3800f800f801f001e0010f001f703ff83cfc007e803fe01ffc0ffe01fe003c00": "ジ",
"1800f800f801f001e0000f001ff03ff81cfe007f803fe01ffc07fe01fe001c00": "シ",
"fc01fc01f801c0000f001f603ff83cfc10fe007fc03ff01ffc07fe01fe003c00": "シ",
I'm attaching it in case you need it. ocr.zip
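The KeyError mentioned above can be reproduced with a minimal sketch. It assumes the year lookup is a plain dict keyed by the OCR text, as in the linked context.py snippet; `YEARS` and `parse_year` are hypothetical names used only for illustration.

```python
YEARS = {
    "ジュニア級": 1,
    "クラシック級": 2,
    "シニア級": 3,
}


def parse_year(year_text: str) -> int:
    """Map the recognized year text to a number; unknown text raises KeyError."""
    return YEARS[year_text]


def is_known_year(year_text: str) -> bool:
    """Check whether the OCR result is one of the expected year strings."""
    return year_text in YEARS
```

If a shared hash is labeled "ジ", the senior-class text comes out as "ジニア級", which is not a key in the dict, so `parse_year` fails even though the OCR step itself reported a perfect match.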
I understand a little bit about how this program works, but... It seems like a difficult problem to solve. I'm going to try to find out if it's a problem with clear type text.
Thank you for your support. (My English is not very good, I apologize for that.)
The top-right ゛ is not included in the OCR image, so you should not mark this as ジ.
Based on the given image, these hashes should all be labeled as シ.
Are you sure you see these hashes while ジ is displayed?
Maybe image pre-processing incorrectly removed the ゛ from the image.
I need a reproducible case (a screenshot where OCR recognizes ジ as シ) to work on.
Are you sure you see these hashes while ジ is displayed?
Yes. Please wait while I recreate the situation.
I found that the first screenshot of this issue is a reproducible case; I can just run a test against it.
I will work on this tonight.
P.S. The clear type text had nothing to do with it. I toggled between enabled and disabled, rebooted, and tried again, but there was no change in the hash value.
My solution to this problem:
The ゛ becomes very blurry, so image pre-processing needs improvement.
Resize + sharpen can solve this problem.
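The resize + sharpen idea can be sketched as follows. This is a pure-Python illustration on a grayscale image stored as nested lists, not the project's actual pre-processing; the function names, the nearest-neighbour upscaling, and the 3x3 kernel are all assumptions chosen for illustration.

```python
from typing import List

Image = List[List[int]]  # grayscale rows, pixel values 0..255

# Common 3x3 sharpening kernel; weights sum to 1, so flat areas are unchanged.
SHARPEN_KERNEL = ((0, -1, 0), (-1, 5, -1), (0, -1, 0))


def upscale_nearest(img: Image, factor: int) -> Image:
    """Enlarge a non-empty image by repeating each pixel factor x factor times."""
    out: Image = []
    for row in img:
        wide = [p for p in row for _ in range(factor)]
        out.extend([list(wide) for _ in range(factor)])
    return out


def sharpen(img: Image) -> Image:
    """Apply the 3x3 kernel to a non-empty image, clamping at borders and to 0..255."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    # Clamp coordinates so edge pixels reuse their nearest neighbour.
                    yy = min(max(y + dy, 0), h - 1)
                    xx = min(max(x + dx, 0), w - 1)
                    acc += SHARPEN_KERNEL[dy + 1][dx + 1] * img[yy][xx]
            out[y][x] = min(max(acc, 0), 255)
    return out
```

A faint mark such as a blurred ゛ gains contrast against its background after `sharpen(upscale_nearest(img, 2))`, which makes it more likely to survive later thresholding. A real implementation would normally use an image library with smoother interpolation.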
Thanks for the response the other day. The program was working fine, but the current version is failing to OCR.
OCR frequently mistakes "ジ" for "シ".
Is there something wrong with my settings? I have not changed any settings in NVIDIA Control Panel -> Manage 3D Settings. (Factory default)
Does the graphics board or Windows settings affect OCR?
TEST VERSION fed9276f2a02a770a3116211a3ce42b7bde10c9b launcher.log
If you need any other logs or screenshots, please let me know.