alex-ong / NESTrisOCR

OCR for statistics in NESTris

Approach for more accurate color scanning #40

Closed · timotheeg closed this 4 years ago

timotheeg commented 4 years ago

Context

Color scanning is currently done with antialiased rescaling to 1x1 pixel. That averages the entire color region selected during calibration (the full block for the stats pieces). That includes the "shine" white pixels, and sometimes the black border in cases of slight miscalibration. The full averaging yields colors that are quite far from the pure color they are meant to represent, and can sometimes cause color-matching issues when scanning the field.

The "pure" color is meant to be at the lower right quadrant of the color blocks, but it would be too risky to ask the user to select just that area in the calibration UI.

Instead, this PR relies on calibration having selected the full blocks; the code thereafter selects only a portion of that area, matching the lower-right quadrant.

The PR includes a script so reviewers can experiment on their own; it can be run from the root folder as

python -m nestris_ocr.assets.sample_inputs.easiercap.test_color1

Approach

Only keep an area 40%x40% of the initial color area, in the lower-right quadrant. This is achieved by dropping the edges; a sketch of the idea is below.
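
For illustration, a minimal sketch of what that crop could look like (the 50%/90% offsets are my guesses, not the exact values from the diff):

```python
from PIL import Image

def pure_color_area(block: Image.Image) -> Image.Image:
    # Keep a 40%x40% window in the lower-right quadrant: starting at 50%
    # skips the white "shine" in the upper-left, and stopping at 90%
    # keeps a margin from the black border in case of miscalibration.
    w, h = block.size
    return block.crop((round(w * 0.5), round(h * 0.5),
                       round(w * 0.9), round(h * 0.9)))
```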

Tested on all level colors for my capture device. No color mismatches detected ✌️

Results

The approach yields better overall results: the final detected colors are less "washed out", and the primary color components are more prominent and closer to what we will capture from the field at run time.

During the PR, a script was created to visualize the computed colors and measure the performance cost, if any (answer: because there is no change to the runtime code path, there is no performance penalty, and because the extracted area to scale down is smaller, it could in fact be slightly faster).

Below are the results of the last run for each of the 2 colors of the 9 levels. For each color, there are 4 rows (a code sketch of the four pipelines follows the figure):

  1. current method in master, resize antialias to 1x1 (yields colors too washed out)
  2. blur before resize nearest to 1x1 (also not good)
  3. crop to lower-right quadrant, then blur, then resize nearest to 1x1 (better but slightly slower)
  4. crop to lower-right quadrant, then blur, then resize antialias to 1x1 (better at same cost)

(image: samples_colors)
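
As a rough illustration of the four rows above (a sketch under assumed parameters: the blur radius and crop offsets are guesses; ANTIALIAS/NEAREST are Pillow's resampling filters):

```python
from PIL import Image, ImageFilter

def four_methods(block):
    # block: a PIL RGB image of one calibrated color block.
    w, h = block.size
    cropped = block.crop((round(w * 0.5), round(h * 0.5),
                          round(w * 0.9), round(h * 0.9)))
    blur = ImageFilter.GaussianBlur(2)  # radius is a guess
    one = (1, 1)
    return {
        # 1. current master: antialias straight down to 1x1
        "antialias": block.resize(one, Image.ANTIALIAS).getpixel((0, 0)),
        # 2. blur the full block, then nearest-neighbour to 1x1
        "blur+nearest": block.filter(blur).resize(one, Image.NEAREST).getpixel((0, 0)),
        # 3. crop to the lower-right quadrant first, then blur + nearest
        "crop+blur+nearest": cropped.filter(blur).resize(one, Image.NEAREST).getpixel((0, 0)),
        # 4. crop first, then blur + antialias
        "crop+blur+antialias": cropped.filter(blur).resize(one, Image.ANTIALIAS).getpixel((0, 0)),
    }
```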

alex-ong commented 4 years ago

Promising, could you test with Level 27 instead? Could you also run a quick test that does the color lookup and prints the distance from the colors/black/white? With "more accurate" methods we would expect the distance to the "correct color" to be minimised.

Also, I saw another OCR algorithm that just samples 4 pixels outside the shine area. It handles cropping beautifully too:

(image: a block with the four sampled pixels marked in green)

By grabbing these green pixels, it accounts for a +1px on any side while still being fast.
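
A minimal sketch of that idea, with hypothetical sample positions (the actual green-pixel offsets aren't recoverable from the thread):

```python
# Hypothetical sample positions (fractions of the block size): four
# points that dodge the white shine in the upper-left while staying
# clear of every edge, so a +/-1px calibration error still lands
# inside the mino.
SAMPLE_POINTS = [(0.3, 0.6), (0.6, 0.3), (0.5, 0.5), (0.7, 0.7)]

def sample_color(block):
    # block: a PIL RGB image of one color block.
    w, h = block.size
    pixels = [block.getpixel((int(fx * w), int(fy * h)))
              for fx, fy in SAMPLE_POINTS]
    # Average the four samples channel-wise.
    return tuple(sum(channel) // len(pixels) for channel in zip(*pixels))
```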

In terms of speed, I suspect the cost ranking is: capturing (biggest cost), resize (second biggest), blur (third biggest).

timotheeg commented 4 years ago

> could you test with Level 27 instead?

Sure, I'll add new test assets in the folder and another script to report the distances.

I did consider using some specific pixels as you showed. I'll try that too.

The grunt work is what is very slow here, but hopefully the results will be useful for the future 🤞

alex-ong commented 4 years ago

In terms of speed, these are starting to get too slow (even the default one might be too slow):

resize antialias 0.01379704475402832 1.379704475402832e-05
resize antialias (48, 59, 187)

That's 0.0138 s for 1000 iterations (13.8 µs per call), or about 2.8 ms for 200 cells. Ideally we want sub-millisecond.
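
For reference, a harness along these lines would produce the two numbers quoted above (total seconds, then seconds per call); the asset filename is hypothetical:

```python
import time
from PIL import Image

def bench(label, fn, iterations=1000):
    fn()  # warm-up
    start = time.perf_counter()
    for _ in range(iterations):
        result = fn()
    total = time.perf_counter() - start
    # Matches the log format above: total seconds, then seconds per call.
    print(label, total, total / iterations)
    print(label, result)

block = Image.open("color1.png")  # hypothetical sample asset
bench("resize antialias",
      lambda: block.resize((1, 1), Image.ANTIALIAS).getpixel((0, 0)))
```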

I noticed a quirk(?) with your testing methodology. Is this for field scanning, or for checking the two dynamic colors?

For the field, ideally we run a full-field filter first, then resize (whether antialias, nearest, or box...), followed by extracting the values quickly.

This methodology has the huge overhead of creating and resizing an image per tile, which, while increasing accuracy by cropping out the disgusting black portions, would add 200x image creations.

If you instead change the "base" image to construct a canvas of 20x10 blocks first, then that would remove the ability to blur correctly. Yikes...

timotheeg commented 4 years ago

> That's 0.0138 s for 1000 iterations (13.8 µs per call), or about 2.8 ms for 200 cells.

Hey? No, this is for color1 and color2; there are only 2 areas to scan, not 200.

> Is this for field scanning, or for checking the two dynamic colors?

The 2 dynamic colors

alex-ong commented 4 years ago

Just copypasta from discord:

For reference colors we can be as slow as we want to get the "true" color, since it's only run twice per frame or less with caching. This could involve cropping/removing black and white pixels (we should know what "white" and "black" are at this point), or simply using a 3x3 pixel area in the centre of the block that dodges the white/black areas.

For the field, we can compute the reference color, then "add" 15 pixels worth of black and 3 pixels worth of white. This gives us a new reference color which should very, very closely match fullfield.resize(filter=box), since each mino would include 15 pixels of black and 3 pixels of white.
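
A worked sketch of that weighted average, assuming an 8x8 mino (the 64-pixel total is my assumption; the 15/3 counts are from the comment above):

```python
def field_reference(pure, mino_px=64, black_px=15, white_px=3):
    # Fold the border/shine pixels into the reference color, assuming an
    # 8x8 mino (64 px, my assumption) with 15 black and 3 white pixels
    # (counts from the comment above); black contributes 0 per channel.
    color_px = mino_px - black_px - white_px
    return tuple((c * color_px + 255 * white_px) // mino_px for c in pure)

# e.g. field_reference((48, 59, 187)) -> (46, 54, 146): a darker,
# slightly whitened reference that should match a box-filtered mino.
```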

timotheeg commented 4 years ago

Updated the script to run against all the sample colors. Results below for all colors and all levels.

There are 3 rows for each color, to represent:

  1. Current approach in master: resize to 1x1 with antialiasing
  2. Experiment 1: Blur, then resize to 1x1 with nearest neighbour
  3. Experiment 2: Crop to lower-right quadrant, Blur, then resize to 1x1 with nearest neighbour

The PR implements the 3rd approach in scan_helpers(); please check.

Coming next, running this against all the sample inputs and checking if:

I realized the confusion with the PR description: the block images I used are from the field instead of from the stats area. This PR is only meant to update how the reference colors are read; it has nothing to do with the field.

(image: 3 comparison rows per color, as listed above)

alex-ong commented 4 years ago

Is the bottom row the new approach? It seems more accurate to me :)

timotheeg commented 4 years ago

> Is the bottom row the new approach?

Yes the bottom row is the new approach :)

Am still not done with the distance summary, but coming soon! :D

timotheeg commented 4 years ago

K, I added the script to check the distance improvements!

Oh boy! Did someone say always better?? 💪

python3 -m nestris_ocr.assets.sample_inputs.easiercap.test_distances
==========
level 0
old [0 0 0] [255 255 255] [ 91  81 190] [124 178 238]
new [0 0 0] [255 255 255] [ 67  50 209] [101 167 251]
board
matches 200
better_and_equal (77, 123)
differs 0
avg_gain 1213.045
==========
level 1
old [0 0 0] [255 255 255] [ 48 137  39] [151 200  61]
new [0 0 0] [255 255 255] [  5 122   0] [137 195  20]
board
matches 200
better_and_equal (81, 119)
differs 0
avg_gain 1562.115
==========
level 2
old [0 0 0] [255 255 255] [139  63 167] [208 123 242]
new [0 0 0] [255 255 255] [127  23 165] [213 108 255]
board
matches 200
better_and_equal (100, 100)
differs 0
avg_gain 948.76
==========
level 3
old [0 0 0] [255 255 255] [ 99  80 191] [116 209  85]
new [0 0 0] [255 255 255] [ 68  44 203] [ 98 213  62]
board
matches 200
better_and_equal (64, 136)
differs 0
avg_gain 957.22
==========
level 4
old [0 0 0] [255 255 255] [ 59 108 129] [205 155 132]
new [0 0 0] [255 255 255] [ 25  86 117] [207 144 115]
board
matches 200
better_and_equal (76, 124)
differs 0
avg_gain 846.1
==========
level 5
old [0 0 0] [255 255 255] [111 201 145] [158 154 245]
new [0 0 0] [255 255 255] [ 84 204 122] [147 136 255]
board
matches 200
better_and_equal (63, 137)
differs 0
avg_gain 699.245
==========
level 6
old [0 0 0] [255 255 255] [145  77  43] [116 119 119]
new [0 0 0] [255 255 255] [134  44   2] [86 90 89]
board
matches 200
better_and_equal (66, 134)
differs 0
avg_gain 1088.32
==========
level 7
old [0 0 0] [255 255 255] [111  65 185] [102  36  58]
new [0 0 0] [255 255 255] [ 93  27 191] [77  0 22]
board
matches 200
better_and_equal (74, 126)
differs 0
avg_gain 1190.615
==========
level 8
old [0 0 0] [255 255 255] [ 94  84 185] [154  81  50]
new [0 0 0] [255 255 255] [ 61  48 197] [145  45   2]
board
matches 200
better_and_equal (91, 109)
differs 0
avg_gain 1909.815
==========
level 9
old [0 0 0] [255 255 255] [147  75  47] [205 158  76]
new [0 0 0] [255 255 255] [140  42   5] [208 142  49]
board
matches 200
better_and_equal (95, 105)
differs 0
avg_gain 1419.57
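
For context, a plausible reading of the metrics above (the actual test_distances implementation may differ): each of the 200 field cells is matched against black/white/color1/color2 by squared RGB distance, and avg_gain would then be the average drop in that distance with the new reference colors. A minimal sketch of such a match:

```python
def dist_sq(a, b):
    # Squared Euclidean distance in RGB; no sqrt needed for ranking.
    return sum((x - y) ** 2 for x, y in zip(a, b))

def match(pixel, palette):
    # palette: [black, white, color1, color2]; returns (index, distance).
    return min(enumerate(dist_sq(pixel, ref) for ref in palette),
               key=lambda pair: pair[1])
```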

I'd say this is good to merge, but feel free to run the scripts for yourself or try the branch first if you'd like :)

timotheeg commented 4 years ago

Eeew, the sample assets for level 4 are not correct; I'll fix them up after dinner.

timotheeg commented 4 years ago

I just realized that after cropping to the more correct region, blur + scale nearest is the same as scale antialias 😑😅

So I just restored that, which means this entire PR is actually only a very small change in scan_helpers to select a more meaningful area from the user's calibration.

Below is the updated graphic showing that the last 2 rows per color yield the same result.

Also, it means the PR introduces no slowdown at all: the area adjustment is computed at bootstrap time, and the processing code is as before.

I suppose the addition of sample inputs and scripts that can (hopefully) be reused next time was still useful.

(image: samples_colors, updated)
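
A quick way to check that equivalence (the filename and blur radius below are placeholders): at 1x1, ANTIALIAS reduces to a weighted mean over every pixel, and blurring before taking the single NEAREST sample approximates the same mean.

```python
from PIL import Image, ImageFilter

img = Image.open("color1_cropped.png")  # hypothetical cropped sample
blur_nearest = (img.filter(ImageFilter.GaussianBlur(2))
                .resize((1, 1), Image.NEAREST).getpixel((0, 0)))
antialias = img.resize((1, 1), Image.ANTIALIAS).getpixel((0, 0))
print(blur_nearest, antialias)  # expected to agree, per the observation above
```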

alex-ong commented 4 years ago

If you're selecting a better region, blur + nearest is the same as resize.box (I think??)

timotheeg commented 4 years ago

Report on antialias versus box for the current setup.

First off, the visual results below, 3 rows per color:

  1. existing in master
  2. antialias
  3. box

(image: color_compare)

Second, runtime performance of the 2 algos (1,000,000 iterations, done for level 0, color1):

antialias 6.329708099365234 6.329708099365234e-06
antialias (48, 137, 39)
cropped,antialias 5.549829006195068 5.5498290061950685e-06
cropped,antialias (7, 122, 1)
cropped,box 5.386526107788086 5.386526107788086e-06
cropped,box (8, 122, 1)

Crop+Box is fastest

Lastly, the distance comparison between antialias and box:

python3 -m nestris_ocr.assets.sample_inputs.easiercap.test_distances
Comparing cropAntialias vs cropBox
==========
level 0
cropAntialias [0 0 0] [255 255 255] [ 69  52 205] [102 168 249]
cropBox [0 0 0] [255 255 255] [ 69  53 206] [102 169 248]
board
matches 200
equal,better,worse (124, 36, 40)
differs 0
avg_gain 0.98
Comparing cropAntialias vs cropBox
==========
level 1
cropAntialias [0 0 0] [255 255 255] [  7 122   1] [138 197  24]
cropBox [0 0 0] [255 255 255] [  8 122   1] [139 197  25]
board
matches 200
equal,better,worse (119, 0, 81)
differs 0
avg_gain -15.045
Comparing cropAntialias vs cropBox
==========
level 2
cropAntialias [0 0 0] [255 255 255] [128  26 167] [211 108 254]
cropBox [0 0 0] [255 255 255] [129  28 167] [210 109 254]
board
matches 200
equal,better,worse (100, 0, 100)
differs 0
avg_gain -25.95
Comparing cropAntialias vs cropBox
==========
level 3
cropAntialias [0 0 0] [255 255 255] [ 71  48 204] [104 211  65]
cropBox [0 0 0] [255 255 255] [ 71  49 205] [104 211  66]
board
matches 200
equal,better,worse (136, 23, 41)
differs 0
avg_gain -6.235
Comparing cropAntialias vs cropBox
==========
level 4
cropAntialias [0 0 0] [255 255 255] [143  29  83] [ 79 203 116]
cropBox [0 0 0] [255 255 255] [142  29  83] [ 80 203 117]
board
matches 200
equal,better,worse (101, 0, 99)
differs 0
avg_gain -19.94
Comparing cropAntialias vs cropBox
==========
level 5
cropAntialias [0 0 0] [255 255 255] [ 83 203 120] [146 135 255]
cropBox [0 0 0] [255 255 255] [ 84 203 121] [147 136 255]
board
matches 200
equal,better,worse (137, 0, 63)
differs 0
avg_gain -16.58
Comparing cropAntialias vs cropBox
==========
level 6
cropAntialias [0 0 0] [255 255 255] [133  43   2] [88 94 92]
cropBox [0 0 0] [255 255 255] [133  43   2] [89 94 93]
board
matches 200
equal,better,worse (189, 1, 10)
differs 0
avg_gain -0.81
Comparing cropAntialias vs cropBox
==========
level 7
cropAntialias [0 0 0] [255 255 255] [ 94  30 187] [73  0 24]
cropBox [0 0 0] [255 255 255] [ 94  31 188] [73  0 24]
board
matches 200
equal,better,worse (165, 31, 4)
differs 0
avg_gain 4.63
Comparing cropAntialias vs cropBox
==========
level 8
cropAntialias [0 0 0] [255 255 255] [ 63  49 195] [140  47   4]
cropBox [0 0 0] [255 255 255] [ 63  49 195] [139  46   4]
board
matches 200
equal,better,worse (161, 9, 30)
differs 0
avg_gain -1.51
Comparing cropAntialias vs cropBox
==========
level 9
cropAntialias [0 0 0] [255 255 255] [139  43   8] [207 141  49]
cropBox [0 0 0] [255 255 255] [139  43   7] [207 141  49]
board
matches 200
equal,better,worse (164, 36, 0)
differs 0
avg_gain 2.7

It's sort of inconclusive for distances: it works slightly better for some levels and slightly worse for others, but either way the loss is very, very small.

Conclusion: because box is slightly (very slightly!) faster than antialias, I'm moving to it.

I have committed the modified scripts that allowed running the above, as well as the update of the scaling algo to BOX.

timotheeg commented 4 years ago

So... Can merge? 😅