Normalize confidence to the length of the track

Apparently, DejaVu's 'fingerprinted_confidence' depends on the length of the track: it can only reach 100% if the entire track fits the recognition window. This means that tracks which do not fit the window are treated unequally depending on their duration. E.g., for a listening window of 2 seconds:

jingle of 3 seconds yielding 10% confidence --> should read 3 / 2 * 10% = 15%
jingle of 6 seconds yielding 10% confidence --> should read 6 / 2 * 10% = 30%

In case 1 DejaVu listened to 2/3 of the track, while in case 2 it listened to only 1/3. So the same "reported" confidence within a fixed window corresponds to a lower "real" confidence in case 1 than in case 2. In other words, the result should be weighted, i.e., multiplied by the coefficient "length of the track" / "window length".
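
A minimal Python sketch of the weighting described above. The function name, its parameters, the cap at 100%, and the pass-through for tracks that fit the window are assumptions made for illustration; they are not part of DejaVu or advent.

```python
def normalize_confidence(reported, track_len_s, window_len_s, min_window_s=2.0):
    """Scale a reported confidence (0.0..1.0) by track length / window length.

    Hypothetical helper: the name, the 1.0 cap and the minimum-window check
    are illustrative assumptions, not an existing DejaVu API.
    """
    if window_len_s < min_window_s:
        # Very short windows would mostly amplify noise, so refuse to scale.
        raise ValueError("window too short for reliable normalization")
    if track_len_s <= window_len_s:
        # The whole track fits the window; the reported value is already comparable.
        return reported
    scaled = reported * track_len_s / window_len_s
    # Assumption: never report more than 100%.
    return min(scaled, 1.0)


# The two jingles from the example, with a 2-second window:
case_1 = normalize_confidence(0.10, track_len_s=3.0, window_len_s=2.0)  # about 0.15
case_2 = normalize_confidence(0.10, track_len_s=6.0, window_len_s=2.0)  # about 0.30
```

Whether to cap the result at 100% or let it exceed it is a design choice; the cap is used here only to keep the value in the same range as the original field.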

Note 1: this only makes sense for window lengths where DejaVu demonstrates reliable detection, i.e., for 2+ seconds. Going below 2 seconds would likely result in noise amplification and many false positives.

Note 2: DejaVu does not record in its database the track length required for this calculation. However, it can be inferred with reasonable precision (circa 0.1%) from the fingerprint offsets. The db-djv-pg tool already does this calculation for the purpose of statistics.
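
A rough Python sketch of how the track length could be inferred from fingerprint offsets. It assumes that offsets are spectrogram frame indices and that DejaVu's default settings apply (44100 Hz sampling rate, 4096-sample window, 0.5 overlap ratio); the function names are hypothetical, and this is not necessarily how db-djv-pg performs the calculation.

```python
# Assumed DejaVu defaults; adjust if the database was built with other settings.
DEFAULT_FS = 44100            # sampling rate, Hz
DEFAULT_WINDOW_SIZE = 4096    # FFT window size, samples
DEFAULT_OVERLAP_RATIO = 0.5   # fraction of the window shared with the next one


def offset_to_seconds(offset, fs=DEFAULT_FS, window_size=DEFAULT_WINDOW_SIZE,
                      overlap_ratio=DEFAULT_OVERLAP_RATIO):
    """Convert a fingerprint offset (frame index) to seconds."""
    # Each frame advances by the part of the window that does not overlap.
    hop = window_size - int(window_size * overlap_ratio)
    return offset * hop / fs


def estimate_track_length(offsets):
    """Estimate track length in seconds as the time of the last fingerprint.

    Hypothetical helper: slightly underestimates the real duration, because
    the track continues a little past its last fingerprint.
    """
    if not offsets:
        raise ValueError("no fingerprint offsets for this track")
    return offset_to_seconds(max(offsets))
```

The offsets would come from the fingerprints stored for a given track; the estimate could then serve as the track length in the weighting coefficient.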

denis-stepanov / advent

Normalize confidence to the length of the track #72