worldveil / dejavu

Audio fingerprinting and recognition in Python
MIT License
6.36k stars, 1.43k forks

Fingerprints/hashes generated are not similar! #211

Open schoudhary101 opened 4 years ago

schoudhary101 commented 4 years ago

I have two audio files, A and B, where B is cut from A, i.e. B is a subset of A. When I generate hashes for A and B (A has more hashes than B, as B is a subset), not all the hashes of B match those of A, though a lot do. Ideally, the hashes of B should themselves be a subset of the hashes of A, which is not the case. My main task is alignment. Can you help me out?
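For the alignment task itself, a partial hash match is usually enough. Below is a minimal sketch of offset voting, assuming each fingerprint is a `(hash, time_offset)` pair (which I believe is the shape dejavu produces, but check your version). The function name `estimate_offset` and the toy data are hypothetical, for illustration only.

```python
from collections import Counter, defaultdict

def estimate_offset(hashes_a, hashes_b):
    """Return (offset, votes) for the most common t_a - t_b over matching hashes.

    hashes_a / hashes_b: iterables of (hash, time_offset) pairs (assumed shape).
    Returns (None, 0) when no hash of B occurs in A.
    """
    # Index A by hash; the same hash can occur at several times.
    times_by_hash = defaultdict(list)
    for h, t_a in hashes_a:
        times_by_hash[h].append(t_a)

    # Every matching pair votes for one candidate offset of B inside A.
    votes = Counter()
    for h, t_b in hashes_b:
        for t_a in times_by_hash.get(h, ()):
            votes[t_a - t_b] += 1

    if not votes:
        return None, 0
    return votes.most_common(1)[0]

# Toy data: B starts 100 frames into A; "hX" is a hash that A does not have.
hashes_a = [("h1", 100), ("h2", 105), ("h3", 110), ("h4", 120), ("h1", 300)]
hashes_b = [("h1", 0), ("h2", 5), ("hX", 7), ("h4", 20)]
print(estimate_offset(hashes_a, hashes_b))
```

True matches all vote for the same offset, while missing or spurious hashes scatter their votes, so the peak survives even when B's hashes are not a strict subset of A's.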

evictor commented 1 year ago

I probably can't help you directly, but it would help to post a diff-style plot: A in the top half and B in the bottom half, both on the same X scale, where X is time (so A is wider and B sits beneath it, aligned at the point in time where it was cut from A) and Y is whatever conveys the hash (e.g. the int representation of the hash). That would make the qualitative differences easy to see: roughly what % of hashes matched, where the matching ones were, and any other anomalies that help debugging, e.g. all the matching hashes being at the end (or other patterns of that nature).
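Even before plotting, the same questions (what % matched, and where) can be answered numerically. This is a rough sketch, again assuming `(hash, time_offset)` pairs; `match_profile` and `bin_width` are hypothetical names, not part of dejavu.

```python
from collections import Counter

def match_profile(hashes_a, hashes_b, bin_width=10):
    """Summarise which hashes of B also occur in A, bucketed by B's time offset.

    Returns (overall_fraction, {bin_start: (matched, total)}).
    """
    known = {h for h, _ in hashes_a}
    matched, total = Counter(), Counter()
    for h, t_b in hashes_b:
        b = t_b // bin_width
        total[b] += 1
        if h in known:
            matched[b] += 1
    overall = sum(matched.values()) / max(1, sum(total.values()))
    per_bin = {b * bin_width: (matched[b], total[b]) for b in sorted(total)}
    return overall, per_bin

# Toy data: "hX" and "hY" are hashes of B that A does not contain.
fp_a = [("h1", 100), ("h2", 105), ("h3", 110), ("h4", 120)]
fp_b = [("h1", 0), ("h2", 5), ("hX", 7), ("h4", 20), ("hY", 25)]

overall, per_bin = match_profile(fp_a, fp_b)
print(f"overall match rate: {overall:.0%}")
for start, (hit, n) in per_bin.items():
    # One text "bar" per time bin: '#' for matched, '.' for unmatched.
    print(f"t={start:>4}  {'#' * hit}{'.' * (n - hit)}  {hit}/{n}")
```

If the unmatched hashes cluster in particular bins (for example only at the start or end of B), that points to edge effects of the cut rather than a general mismatch.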

I will say that B, being cut from A, was most likely not quantized along the same "beat" as A: its analysis windows start at different sample positions than the windows that covered the same audio in A. That is why there is some concept of "overlap" for the windows used to pick out the predominant frequencies. I probably botched the terminology; I can't remember what the vars are called in the code.

evictor commented 1 year ago

To add to the previous comment re: windows and overlap: what you're seeing is what would happen if the overlap were too low, so that a shift in quantization between the sample and the full track leaves the windows of the full track covering the sample poorly at some shifts within a beat. Again, I hope my terminology (inspired by digital music) makes sense to you.
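To make the "shift within a beat" idea concrete, here is a small arithmetic sketch. It assumes the spectrogram advances by a fixed hop of `window_size - int(window_size * overlap_ratio)` samples; I believe dejavu calls these settings `DEFAULT_WINDOW_SIZE` and `DEFAULT_OVERLAP_RATIO` (4096 and 0.5 by default), but treat that as an assumption and check your copy. The helper names are hypothetical.

```python
def hop_size(window_size, overlap_ratio):
    """Samples between the starts of consecutive analysis windows."""
    return window_size - int(window_size * overlap_ratio)

def grid_misalignment(cut_start, window_size, overlap_ratio):
    """Distance in samples from B's first window to the nearest window start of A.

    cut_start is the sample index in A at which B was cut.
    0 means B's windows coincide exactly with A's.
    """
    hop = hop_size(window_size, overlap_ratio)
    r = cut_start % hop
    return min(r, hop - r)

# B cut 10 s into a 44.1 kHz file, i.e. at sample 441000.
cut = 44100 * 10
for ratio in (0.5, 0.75):
    hop = hop_size(4096, ratio)
    shift = grid_misalignment(cut, 4096, ratio)
    print(f"overlap={ratio}: hop={hop}, misalignment={shift} samples "
          f"(worst case {hop // 2})")
```

The worst-case misalignment is half a hop, so raising the overlap ratio shrinks it; cutting B at a multiple of the hop size removes it, although the edges of B can still yield different peaks.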

Re: the graph/chart, I was thinking of two separate bar charts in the upper and lower halves of the image, but it might make more sense to put them on the same chart in different colors. Presentation options would be overlapping bars, side-by-side bars, stacked bars, or a computed delta (0 means equal bars, so deviations would stand out, although I think the ints are effectively random hash values, in which case the size of the deviation is meaningless).