drduker closed this issue 3 years ago
Hi @drduker,
those images are quite challenging for ssocr. I can provide a few hints and already have example settings for the first image:
As general advice, a look at the debug image generated with -D and called testbild.png by default is usually required to find working settings. Adding debug output with option -P prints information about the image and what ssocr does, which can help to find out what goes wrong.
The dynamic_threshold command is intended to adapt to changing brightness levels inside the image. It relies on those changes taking place over areas larger than what is needed to be sure of containing both object and background pixels. I think this condition is not met in those images, so dynamic_threshold does not work out of the box. It can still be useful, though. It is often helpful to also use the option -a together with the dynamic_threshold command. Note that the dynamic_threshold command is quite slow.
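To illustrate the idea behind local thresholding, here is a minimal Python sketch. It is not ssocr's implementation: the function name local_mean_threshold, the use of the window mean as threshold, and the sample values are all assumptions made for illustration. A pixel is marked as foreground if it is darker than the mean of a small window around it.

```python
def local_mean_threshold(gray, win_w, win_h):
    """Return a 0/1 image: 1 where a pixel is darker than its local window mean.

    gray is a list of rows of luminance values; win_w and win_h give the
    window size. Hypothetical helper, not taken from ssocr.
    """
    height, width = len(gray), len(gray[0])
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        y0 = max(0, y - win_h // 2)
        y1 = min(height, y + win_h // 2 + 1)
        for x in range(width):
            x0 = max(0, x - win_w // 2)
            x1 = min(width, x + win_w // 2 + 1)
            window = [gray[j][i] for j in range(y0, y1) for i in range(x0, x1)]
            mean = sum(window) / len(window)
            out[y][x] = 1 if gray[y][x] < mean else 0
    return out


# One image row whose background fades from 200 down to 120.
# The dark pixels are 130 (in the bright part) and 40 (in the dim part).
row = [[200, 130, 200, 180, 160, 140, 120, 40, 120]]
print(local_mean_threshold(row, 3, 1))
# prints [[0, 1, 0, 0, 0, 0, 0, 1, 0]]
```

No single global threshold separates this row, because the dark pixel at 130 is brighter than the dim background at 120. The local window handles it, but only because each window around a dark pixel also contains background pixels. A window covering only background or only foreground gives no useful threshold, which is the limitation described above.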
To recognize the newer RSA tokens with a gap between digits three and four, it is often helpful to recognize both digit groups individually when the brightness level is inconsistent. The crop command can be used to select the appropriate image part.
To remove smaller connected components and keep only the bigger ones, hopefully leaving just the segments, the opening command can be used. To close gaps in segments, the closing command can be used.
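The effect of opening and closing can be sketched with textbook binary morphology in Python. This is only an illustration under assumptions: it uses a 3x3 neighbourhood, treats pixels outside the image as background, and the function names are hypothetical. It does not claim to match how ssocr implements these commands.

```python
def _neighbourhood(img, y, x):
    """Yield the 3x3 neighbourhood of (y, x); outside the image counts as 0."""
    h, w = len(img), len(img[0])
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            ny, nx = y + dy, x + dx
            yield img[ny][nx] if 0 <= ny < h and 0 <= nx < w else 0


def erode(img):
    """Keep a pixel only if its whole neighbourhood is foreground (1)."""
    return [[1 if all(_neighbourhood(img, y, x)) else 0
             for x in range(len(img[0]))] for y in range(len(img))]


def dilate(img):
    """Set a pixel if any pixel in its neighbourhood is foreground (1)."""
    return [[1 if any(_neighbourhood(img, y, x)) else 0
             for x in range(len(img[0]))] for y in range(len(img))]


def opening(img, n=1):
    """n erosions followed by n dilations: removes small components."""
    for _ in range(n):
        img = erode(img)
    for _ in range(n):
        img = dilate(img)
    return img


def closing(img, n=1):
    """n dilations followed by n erosions: fills small gaps."""
    for _ in range(n):
        img = dilate(img)
    for _ in range(n):
        img = erode(img)
    return img


# A 3x3 "segment" plus one isolated noise pixel at row 2, column 5.
noisy = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0, 0],
    [0, 1, 1, 1, 0, 1, 0],
    [0, 1, 1, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
]
for line in opening(noisy):
    print(line)  # the 3x3 block survives, the isolated pixel is gone
```

In this sketch, a larger n removes bigger specks and closes wider gaps, at the cost of also wiping out thin segments or merging neighbouring ones.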
The above can lead to the following two command lines (without debug options) to recognize the first image:
$ ssocr -d3 -a crop 0 0 858 510 dynamic_threshold 60 60 white_border 50 opening 18 closing 9 shear 30 issue14_pic01-641760.png
641
$ ssocr -d3 -a -t30 crop 933 0 765 510 dynamic_threshold 60 60 white_border 50 opening 11 closing 15 shear 30 issue14_pic01-641760.png
760
A method that should work better, but is not implemented in ssocr, is to create a reference image showing only background (which is lighter than the foreground in this case) and then look at the absolute difference between the reference image and the image to recognize. Foreground (object) pixels would have high difference values, background pixels low difference values.
The reference image could probably be created by taking many images, enough that every segment is off in at least one of the images, and combining all those images by keeping the highest luminosity values for each pixel position. The pamarith program from Netpbm with option -maximum may be usable, after converting the input images to GRAYSCALE PAM or PGM format.
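The per-pixel maximum can be sketched in a few lines of Python. This is an untested idea expressed as code, not something ssocr provides: the function name reference_image and the sample luminance values are made up for illustration, and the images are plain lists of rows rather than real PGM files.

```python
def reference_image(images):
    """Combine equally sized grayscale images by keeping, for each pixel
    position, the highest luminance value found in any of them."""
    return [[max(pixels) for pixels in zip(*rows)] for rows in zip(*images)]


# Two tiny 1x4 "photos" of the same display. Dark values (40, 45) are
# segments that are on in one photo but off in the other.
photo_a = [[200, 40, 210, 205]]
photo_b = [[198, 202, 45, 207]]
print(reference_image([photo_a, photo_b]))
# prints [[200, 202, 210, 207]]
```

The result contains only background brightness, provided every segment is off (light) in at least one input image.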
The difference image for recognition could be created with the same pamarith program with the -difference option (all input images in PGM or GRAYSCALE PAM format).
The difference image should then be usable with ssocr -f white (and possibly additional options and/or commands).
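As a rough Python sketch of the difference step (again with a hypothetical function name and made-up sample values, and plain lists of rows standing in for real image files):

```python
def difference_image(reference, image):
    """Return the per-pixel absolute difference of two equally sized
    grayscale images given as lists of rows."""
    return [[abs(r - p) for r, p in zip(ref_row, img_row)]
            for ref_row, img_row in zip(reference, image)]


# Background-only reference and a photo in which two segments are on
# (the dark values 40 and 50).
reference = [[200, 202, 210, 207]]
photo = [[199, 40, 208, 50]]
print(difference_image(reference, photo))
# prints [[1, 162, 2, 157]]
```

Segments that are on end up with high values, i.e. light pixels on a dark background, which is why a white foreground setting would fit the resulting image.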
I would expect that other image manipulation tools, e.g., ImageMagick or perhaps GIMP, would allow creation of reference and difference images as well, but I do not know how.
But since I have never needed to use this method, I have neither tried it out nor implemented it in ssocr. Please let me know if you try this method, and if it worked for you.
Thanks, Erik
I already spent too much time on this, I think. I really appreciate the example, and it's good to know it's possible. I think I need a better camera for these. And the new RSA tokens purposefully have a concave top to limit OCR abilities, which makes it even more important to grab good pictures. I feel like I would need to print a dark box with a place for the camera and add LED lighting to get the same type of picture every time. A Raspberry Pi camera might do this. But again, I have already spent too much time on this. I might swing by in a couple of months when I want to take another stab at it. I definitely learned some things with this.
I noticed that the macOS 11 Preview app has improved offline OCR functionality, so I tried automating with that. I'm so glad it works offline without the internet. There are so many good OCR services that just don't work offline and require a "cloud connection". It's a shame. I'm glad I found ssocr for future things. My goal was to automate copying the RSA token to the clipboard.
It would be great if you added builds for different architectures to your releases. That would be sweet. The instructions were fine for building it myself, but having the builds available to just download would have been nice.
Hi @drduker,
building a fixed setup for consistent pictures is really important when attempting reliable automatic OCR. Quite important is indirect lighting, in order to achieve consistent lighting across the whole display.
Feel free to contact me again if you want to give ssocr another try in the future. If you would like to try the reference & difference image approach, and have a consistent setup, you could send me the set of images to create the reference image, and I would look into implementing the functionality in ssocr, as time permits. Please note that ssocr is a hobby project that I work on in my spare time only.
Regarding binary releases:
I understand that and why you would prefer binary releases. There was a time when I was not comfortable with building software myself.
But I can only test builds for my own system (currently Ubuntu GNU/Linux 18.04 LTS on x86_64). Those may or may not work on different GNU/Linux x86_64 systems, and they will not work on macOS or Windows, or x86, or ARM, or any other OS or hardware platform. I do not think providing those binaries is that useful, and I expect there would be quite a few hard-to-debug problems for the users, where I cannot really help. Thus I do not want to provide those myself.
Third parties do provide ssocr packages that can be installed via some kind of package management:
ssocr is available via NixOS which may not only work on GNU/Linux, but even on macOS and Windows.
Binary ssocr packages are available in Debian GNU/Linux and Ubuntu GNU/Linux.
ssocr is available in the FreeBSD ports system.
Thanks, Erik
Any idea on how to process this image?