Add ability to OCR the bus odometer

ekimekim commented 10 months ago

This is a big one and it's a mess, sorry. I'll try to clean it up a bit before merging.

This adds a component bus_analyzer. It watches segment directories and runs an analysis against each segment file. Right now the only thing that analysis does is OCR the current bus odometer.

It places this result (or NULL if it couldn't read the odo) into a database table bus_data. This table stores an odometer reading for each segment, along with a timestamp.

Finally, thrimshim has a method added that can read this table to fetch the latest reading. I put this in thrimshim because restreamer doesn't have database access.
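
To make the data flow concrete, here is a minimal sketch of the table and the "latest reading" lookup. The column names, the segment paths and the timestamps are all hypothetical (the description above only says the table holds one odometer reading or NULL per segment, plus a timestamp). It uses SQLite from the standard library purely so it runs standalone, and it assumes "latest" means the newest non-NULL reading.

```python
import sqlite3

# Hypothetical schema: only "one reading (or NULL) per segment, with a
# timestamp" comes from the PR text. Column names and types are assumptions.
SCHEMA = """
CREATE TABLE bus_data (
    segment TEXT PRIMARY KEY,  -- assumed: path of the analyzed segment file
    timestamp TEXT NOT NULL,   -- assumed: ISO 8601 start time of the segment
    odometer REAL              -- NULL when the OCR could not read the odo
)
"""

def record_reading(conn, segment, timestamp, odometer):
    """Store the analysis result for one segment (odometer may be None)."""
    conn.execute(
        "INSERT OR REPLACE INTO bus_data (segment, timestamp, odometer) "
        "VALUES (?, ?, ?)",
        (segment, timestamp, odometer),
    )

def latest_reading(conn):
    """Return (timestamp, odometer) of the newest non-NULL reading, or None."""
    return conn.execute(
        "SELECT timestamp, odometer FROM bus_data "
        "WHERE odometer IS NOT NULL ORDER BY timestamp DESC LIMIT 1"
    ).fetchone()

conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
# Made-up segments: two successful reads followed by one failed read.
record_reading(conn, "buscam/a.ts", "2000-01-01T00:00:00", 112.0)
record_reading(conn, "buscam/b.ts", "2000-01-01T00:00:02", 113.0)
record_reading(conn, "buscam/c.ts", "2000-01-01T00:00:04", None)
print(latest_reading(conn))
```

Storing NULL rows rather than skipping them lets a consumer tell "segment analyzed, no reading" apart from "segment not analyzed yet".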

We intend to use this endpoint on the VST website backend to drive some automation around "next point" etc.

An important point to note is that this OCR only works reliably on a high resolution BusCam, not on the small buscam in the corner of the main stream. We assume this is available under the channel name buscam (though this is of course configurable).

The OCR is very basic right now and only tries to read the first 4 digits, not the 1/10th mile digit, which is a different shape and moves in 8 animation frames instead of ticking over instantly.

The OCR is based on the following process:

1. Cut out a vertical slice of where the odometer digit might be, as it moves up and down.
2. Find the region of the slice which has the most brightness, and assume this is our digit.
3. Normalize the digit by making it greyscale and adjusting the contrast.
4. Compare it to a series of "prototype" digits. Comparison is based on the pixel error squared.
5. Whichever prototype is most similar to the digit is the winner, and we assign a score based on how close it was to the runner-up.
6. We average the scores for each digit in the image, and call that the frame's score.
7. If the frame's score is above a threshold, we record a reading. Otherwise we record NULL ("could not determine a reading").
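
The per-digit part of this process can be sketched in plain Python. This is an illustration under stated assumptions, not the code in this PR: images are lists of rows of greyscale values (so the greyscale conversion is assumed to have happened already), all function names are made up, the prototypes are toy 3x3 patterns, and similarity is taken to be 1 minus the mean squared pixel error, which is one plausible reading of "pixel error squared".

```python
def crop(image, left, top, width, height):
    """Cut a rectangle out of an image stored as a list of pixel rows."""
    return [row[left:left + width] for row in image[top:top + height]]

def brightest_window(strip, digit_height):
    """Return the digit_height consecutive rows with the most total brightness."""
    row_sums = [sum(row) for row in strip]
    starts = range(len(strip) - digit_height + 1)
    best = max(starts, key=lambda s: sum(row_sums[s:s + digit_height]))
    return strip[best:best + digit_height]

def normalize(digit):
    """Stretch contrast so the darkest pixel maps to 0.0 and the brightest to 1.0."""
    flat = [p for row in digit for p in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:  # flat image: nothing to stretch
        return [[0.0 for _ in row] for row in digit]
    return [[(p - lo) / (hi - lo) for p in row] for row in digit]

def similarity(a, b):
    """1 minus mean squared pixel error (assumed mapping); 1.0 means identical."""
    errors = [(pa - pb) ** 2 for ra, rb in zip(a, b) for pa, pb in zip(ra, rb)]
    return 1.0 - sum(errors) / len(errors)

def read_digit(digit, prototypes):
    """Return (value, score): best prototype and its margin over the runner-up."""
    ranked = sorted(
        ((similarity(digit, proto), value) for value, proto in prototypes.items()),
        reverse=True,
    )
    (best_sim, best_value), (second_sim, _) = ranked[0], ranked[1]
    return best_value, best_sim - second_sim

# Toy prototypes, purely illustrative.
PROTOTYPES = {
    0: [[1, 1, 1], [1, 0, 1], [1, 1, 1]],
    1: [[0, 1, 0], [0, 1, 0], [0, 1, 0]],
    7: [[1, 1, 1], [0, 0, 1], [0, 0, 1]],
}

# A fake frame: a bright "7" sitting partway down a dim vertical strip.
frame = [
    [0, 10, 10, 10, 0],
    [0, 10, 10, 10, 0],
    [0, 200, 210, 205, 0],
    [0, 20, 15, 200, 0],
    [0, 10, 20, 210, 0],
    [0, 10, 10, 10, 0],
]
strip = crop(frame, left=1, top=0, width=3, height=6)            # step 1
digit = normalize(brightest_window(strip, digit_height=3))       # steps 2-3
value, score = read_digit(digit, PROTOTYPES)                     # steps 4-5
print(value, round(score, 3))
```

Here the window search locks onto rows 2 to 4 of the strip, and the normalized digit matches the toy "7" far better than the toy "0" or "1", so the margin over the runner-up is comfortably large.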

Future work:

- Read the last digit
- Read the in-between states of the last digit (eg. halfway between 1 and 2 should be read as 1.5)
- Read the clock in addition to the odo
- Read the other dials (such as speed) by looking at the angle of the green line

ekimekim commented 10 months ago

By way of example, here is a test frame (image: test-0113 25).

We grab its odometer (image: odo).

Within the odometer, we find each digit and normalize it (images: test-0113 25-digit0, test-0113 25-digit1, test-0113 25-digit2, test-0113 25-digit3, test-0113 25-digit4).

Then we compare each one to the prototypes. Here is debug output for comparing digit 3:

Digit = 3 with score 0.07662805074830303
0: 0.6749983397240747
1: 0.49500557305278614
2: 0.7607713328448434
3: 0.899052645935728
4: 0.589575909300537
5: 0.7584262193861606
6: 0.7154477278085176
7: 0.5766948239415489
8: 0.7621688962241706
9: 0.822424595187425

We have picked the value 3 because it had the highest similarity (0.899), with 9 being the runner up (0.822) for a final score of 0.899 - 0.822 = 0.077. This is actually quite a low score! 3s and 9s are very similar in this font. We are saved by the other digits all being a more sure bet, which brings up our overall score:

Digit = 0 with score 0.12030949885125786
Digit = 1 with score 0.22004621713938277
Digit = 1 with score 0.2500588058898393
Digit = 3 with score 0.07662805074830303
test-frames/test-0113.25.png: 0113 with score 0.16676064315719574

Our threshold is currently 0.1, because an all-black frame scores a 0.7, since 1s aren't very different from all-black. You can see why this is a crappy scoring system, but it works well enough for now.
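
For anyone who wants to check the arithmetic, here is a small sketch that reproduces the numbers from the debug output above. The function names are made up for illustration and this is not the code in the PR. It only encodes what is described: digit score = best similarity minus runner-up, frame score = mean of the digit scores, and a reading is recorded only when the frame score is above the threshold.

```python
def digit_score(similarities):
    """Return (value, score) where score = best similarity minus the runner-up."""
    ranked = sorted(similarities.items(), key=lambda kv: kv[1], reverse=True)
    (value, best), (_, runner_up) = ranked[0], ranked[1]
    return value, best - runner_up

def frame_reading(digit_results, threshold=0.1):
    """Average the digit scores; return (reading or None, frame score)."""
    frame_score = sum(score for _, score in digit_results) / len(digit_results)
    if frame_score <= threshold:
        return None, frame_score  # "could not determine a reading"
    return "".join(str(value) for value, _ in digit_results), frame_score

# Similarities for digit 3, copied from the debug output.
digit3 = {
    0: 0.6749983397240747,
    1: 0.49500557305278614,
    2: 0.7607713328448434,
    3: 0.899052645935728,
    4: 0.589575909300537,
    5: 0.7584262193861606,
    6: 0.7154477278085176,
    7: 0.5766948239415489,
    8: 0.7621688962241706,
    9: 0.822424595187425,
}
value, score = digit_score(digit3)

# The other three digit scores, also copied from the debug output.
results = [
    (0, 0.12030949885125786),
    (1, 0.22004621713938277),
    (1, 0.2500588058898393),
    (value, score),
]
reading, frame_score = frame_reading(results)
print(value, score)
print(reading, frame_score)
```

The weak 3-vs-9 margin (about 0.077) would fail the 0.1 threshold on its own, but the mean over all four digits (about 0.167) passes, which is why the frame is still read as 0113.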

chrusher commented 10 months ago

This is a good implementation of the chosen solution. Where I feel there is room for improvement is in deciding whether a match is good or not.

dbvideostriketeam / wubloader

Add ability to OCR the bus odometer #356