auerswal / ssocr

Seven Segment Optical Character Recognition
https://www.unix-ag.uni-kl.de/~auerswal/ssocr/index.html
GNU General Public License v3.0

-W and -H don't seem to work on this image? #20

Closed chconnor closed 1 year ago

chconnor commented 1 year ago

Hello!

With this image, using this command:

ssocr rotate 359 crop 2394 960 612 327 shear 30 -H 6 -W 2.5 -P -Dssocr_out.png -d -1 -T ssocr_demo.jpg

I get ".82.8" instead of "82.8", with the stdout shown below. It seems to be finding a single-pixel decimal point, despite the minimum size I specified via the -H and -W ratios.

Debug image here.

Anything I can be doing better? Or is it a bug? Thanks!

================================================================================
flags & VERBOSE=4
thresh=50.000000
flags & PRINT_INFO=0
flags & ADJUST_GRAY=0
flags & ABSOLUTE_THRESHOLD=0
flags & DO_ITERATIVE_THRESHOLD=2
flags & USE_DEBUG_IMAGE=8
flags & DEBUG_OUTPUT=0
flags & PROCESS_ONLY=0
flags & ASCII_ART_SEGMENTS=0
flags & PRINT_AS_HEX=0
flags & OMIT_DECIMAL=0
flags & PRINT_SPACES=0
flags & SPC_USE_AVG_DST=0
need_pixels = 1
ignore_pixels = 0
number_of_digits = -1
foreground = 0 (black)
background = 255 (white)
luminance  = Rec709
charset    = full
height/width threshold for one   = 3
width/height threshold for minus = 2
max_dig_h/h threshold for decimal = 6
max_dig_w/w threshold for decimal = 2
distance factor for adding spaces = 1.40
optind=10 argc=20
================================================================================
argv[argc-1]=000001.jpg used as image file name
loading image 000001.jpg
image width: 5184
image height: 3456
0.00 <= lum <= 173.00 (lum should be in [0,255])
adjusting threshold to image: 50.000000 -> 33.921569
doing iterative_thresholding: 33.921569 -> 19.803922
using threshold 19.80
got commands rotate (argv[10]) 359 (argv[11]) crop (argv[12]) 2394 (argv[13]) 960 (argv[14]) 612 (argv[15]) 327 (argv[16]) shear (argv[17]) 30 (argv[18])
 processing rotate 359.000000 (from string 359)
 cropping from (2394,960) to (3006,1287) [width 612, height 327] (from strings 2394, 960, 612, and 327)
  cropped image width: 612
  cropped image height: 327
  9.00 <= lum <= 125.00 in cropped image (lum should be in [0,255])
adjusting threshold to image: 19.803922 -> 12.538255
doing iterative_thresholding: 12.538255 -> 27.058824
using threshold 27.06
 processing shear 30 (from string 30)
auto detecting number of digits: 5
digits are at most 117 pixels wide and 248 pixels high
found 5 digits
digit 0: (117,39) -> (118,40), width: 1 ( 0.20%) height: 1 ( 0.31%)
  height/width (int): 1, max_dig_w/width (int): 117, max_dig_h/height (int): 248
digit 1: (183,64) -> (300,312), width: 117 (23.93%) height: 248 (76.07%)
  height/width (int): 2, max_dig_w/width (int): 1, max_dig_h/height (int): 1
digit 2: (337,64) -> (451,311), width: 114 (23.31%) height: 247 (75.77%)
  height/width (int): 2, max_dig_w/width (int): 1, max_dig_h/height (int): 1
digit 3: (460,290) -> (481,310), width: 21 ( 4.29%) height: 20 ( 6.13%)
  height/width (int): 0, max_dig_w/width (int): 5, max_dig_h/height (int): 12
digit 4: (489,64) -> (606,311), width: 117 (23.93%) height: 247 (75.77%)
  height/width (int): 2, max_dig_w/width (int): 1, max_dig_h/height (int): 1
looking for digit 1
looking for decimal points
 digit 0 is a decimal point
 digit 3 is a decimal point
looking for minus signs
Display as seen by ssocr:
      _   _       _
     |_|  _|     |_|
  .  |_| |_   .  |_|

.82.8
using png format for debug image
writing debug image to file out.png

chconnor commented 1 year ago

Oh, I see: -W and -H do the opposite of what I expected. I thought they set a minimum size for a decimal point, but they in fact set the maximum size.
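To make the "maximum size" reading concrete, here is a minimal sketch that replays the bounding boxes and thresholds from the debug output above. It is not ssocr's code: the helper name `is_decimal_point` is hypothetical, and the strict "greater than" comparison and the integer division are assumptions based on the "(int)" ratios in the log and on which digits the log reports as decimal points.

```python
# Bounding boxes (width, height) of the five detected digits, from the log.
boxes = [(1, 1), (117, 248), (114, 247), (21, 20), (117, 247)]

def is_decimal_point(width, height, max_w, max_h, w_ratio=2, h_ratio=6):
    """Hypothetical helper: True if the box is small in BOTH dimensions.

    w_ratio and h_ratio are the thresholds printed in the log
    (max_dig_w/w = 2, max_dig_h/h = 6). The strict '>' is an assumption.
    """
    # Integer division mirrors the "(int)" ratios printed in the log.
    return (max_w // width > w_ratio) and (max_h // height > h_ratio)

max_w = max(w for w, _ in boxes)  # widest digit: 117 pixels
max_h = max(h for _, h in boxes)  # tallest digit: 248 pixels

decimal_flags = [is_decimal_point(w, h, max_w, max_h) for w, h in boxes]
print(decimal_flags)  # [True, False, False, True, False]
```

Under this rule, digits 0 and 3 are flagged, which matches the log. It also shows why no choice of thresholds helps here: the 1x1 speck always produces the largest possible ratios, so any thresholds that accept the real decimal point accept the speck as well.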

Suggestion: change these man page lines:

   -H, --dec-h-ratio RATIO
       Set the max_digit_height/height ratio used for recognition of a decimal separator.  This value is used in combination with the max_digit_width/width ratio.

   -W, --dec-w-ratio RATIO
       Set the max_digit_width/width ratio used for recognition of a decimal separator.  This value is used in combination with the max_digit_height/height ratio.

To this:

   -H, --dec-h-ratio RATIO
       Set the max_digit_height/height ratio used for recognition of a decimal separator.  This value is used in combination with the max_digit_width/width ratio and sets an upper bound on how tall a decimal point can be; both ratios must be exceeded for a decimal point to be detected.

   -W, --dec-w-ratio RATIO
       Set the max_digit_width/width ratio used for recognition of a decimal separator.  This value is used in combination with the max_digit_height/height ratio and sets an upper bound on how wide a decimal point can be; both ratios must be exceeded for a decimal point to be detected.

auerswal commented 1 year ago

Thanks for letting me know!

I intend to update the documentation and create a bug-fix release of ssocr to address this. I will add a comment here once that is done.

I have not yet closely examined your image, but if the problem really pertains to a single pixel, you might be able to remove it using the remove_isolated command. If there are larger groups of noise pixels, you might be able to remove them using the keep_pixels_filter NUMBER_OF_NEIGHBORS command.
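As a rough illustration of what removing isolated pixels means, the sketch below clears every foreground pixel that has no foreground neighbor. It is an independent toy version, not ssocr's implementation: the function name `drop_isolated_pixels`, the list-of-rows image format, and the use of the 8-neighborhood are all assumptions made for the example.

```python
def drop_isolated_pixels(image):
    """Return a copy of a binary image (list of rows, 1 = foreground) in which
    foreground pixels without any foreground neighbor are cleared.

    Neighbors are the up to 8 surrounding pixels (an assumption).
    """
    height, width = len(image), len(image[0])
    result = [row[:] for row in image]
    for y in range(height):
        for x in range(width):
            if not image[y][x]:
                continue  # background pixels are left alone
            has_neighbor = any(
                image[ny][nx]
                for ny in range(max(0, y - 1), min(height, y + 2))
                for nx in range(max(0, x - 1), min(width, x + 2))
                if (ny, nx) != (y, x)
            )
            if not has_neighbor:
                result[y][x] = 0
    return result

# A lone speck in the top-left corner and a 2x2 blob (a "real" decimal point).
image = [
    [1, 0, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
cleaned = drop_isolated_pixels(image)
for row in cleaned:
    print(row)
```

The speck disappears while the 2x2 blob survives, which is the effect wanted for the single-pixel false decimal point in this issue.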

Since the image is quite large, you may also be able to use the --ignore-pixels NUMBER option, so that image segmentation requires more than NUMBER pixels to detect a foreground feature.

auerswal commented 1 year ago

OK, I've just tried it out, remove_isolated works:

$ ssocr rotate 359 crop 2394 960 612 327 shear 30 -H 6 -W 2.5 -d -1 -T remove_isolated ssocr_demo.jpg
82.8

Of course, I still intend to improve the documentation!

auerswal commented 1 year ago

I have just released ssocr version 2.22.2 with just documentation improvements (--help output, --debug-output output, man page, and web page).

chconnor commented 1 year ago

Thanks! Yeah, I used -i and it worked as well. Great to hear the doc was updated, thanks for the software!

chconnor commented 1 year ago

Another suggestion: the documentation for -r and -m could possibly be updated: -r sets the ratio "threshold" for the 1 digit, and I again assumed that was an upper bound on the ratio, but it's actually a lower bound, if I'm reading the code correctly.
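The lower-bound reading of -r can be sketched the same way, using the threshold printed in the debug output earlier in this thread ("height/width threshold for one = 3"). The helper name `looks_like_one` and the strict comparison are assumptions for illustration, not a copy of the ssocr source.

```python
ONE_RATIO = 3  # "height/width threshold for one = 3" in the debug output

def looks_like_one(width, height, one_ratio=ONE_RATIO):
    """Hypothetical helper: a digit counts as a 1 only if it is narrow enough
    that height/width exceeds the threshold, so the ratio is a lower bound."""
    # Integer division mirrors the "(int)" ratios printed in the log.
    return height // width > one_ratio

wide_digit = looks_like_one(117, 248)   # the "8" from the log: ratio 2
narrow_digit = looks_like_one(30, 248)  # made-up narrow box: ratio 8
print(wide_digit, narrow_digit)  # False True
```

The full-width digits from the log have a ratio of 2, so none of them is treated as a 1; only a much narrower box would be.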

Perhaps -n could/should be applied to detecting 1's and -'s as well? In general it would be nice to be able to specify minimum digit widths ("ignore any digit detected that is narrower than X") and it seems like -n would be a natural way to achieve the same thing.

Edit: I'll make an FR...

auerswal commented 1 year ago

> Another suggestion: the documentation for -r and -m could possibly be updated: -r sets the ratio "threshold" for the 1 digit, and I again assumed that was an upper bound on the ratio, but it's actually a lower bound, if I'm reading the code correctly.

I did that already, when I updated the documentation regarding -H and -W, please see the man page.