adblockradio / stream-audio-fingerprint

Audio landmark fingerprinting as a Node Stream module
Mozilla Public License 2.0
768 stars, 64 forks

Using shorter intervals? #4

Closed: alenl closed this issue 6 years ago

alenl commented 6 years ago

Hi @dest4,

If I'm not mistaken, you use 12-second windows to generate fingerprints, so it is not possible to match pieces of audio that are shorter than 12 seconds, or that are spaced less than 12 seconds apart, against a longer stream. Is this correct? Would it be possible to make it work with shorter windows (e.g. 3 or 4 seconds) by changing some of the constants, and if so, which ones?

To give a bit of context: I have a long audio recording (a voice-over recording session, several hours long), from which a selection of takes has already been cut out into separate files and slightly processed (so volume and quality are not the same). I need to find the position (in seconds) in the long recording at which each file originated. As it is right now, for some of them it doesn't find a single match, presumably because they are too close together in the large file.

Thanks!
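For readers with the same task, here is a minimal sketch of the usual way landmark fingerprints are turned into a position: count how often each time difference occurs between matching hashes, and take the most frequent one. The `{ hash, t }` objects, the `estimateOffset` helper and the sample data are hypothetical and are not this library's API; `t` is assumed to be a frame index, converted to seconds using the frame duration NFFT / 2 / SAMPLING_RATE with the default values 512 and 22050.

```javascript
// Seconds per frame, assuming NFFT = 512 and SAMPLING_RATE = 22050 Hz.
const FRAME_DT = 512 / 2 / 22050;

// Hypothetical helper: both inputs are arrays of { hash, t }, with t in frames.
function estimateOffset(recording, take) {
  // Index the long recording's fingerprints by hash.
  const byHash = new Map();
  for (const { hash, t } of recording) {
    if (!byHash.has(hash)) byHash.set(hash, []);
    byHash.get(hash).push(t);
  }

  // Each matching hash votes for the time difference (recording minus take).
  const votes = new Map();
  for (const { hash, t } of take) {
    for (const tRec of byHash.get(hash) || []) {
      const delta = tRec - t;
      votes.set(delta, (votes.get(delta) || 0) + 1);
    }
  }

  // The time difference with the most votes is the estimated position.
  let best = null;
  for (const [delta, count] of votes) {
    if (best === null || count > best.count) best = { delta, count };
  }
  return best === null
    ? null
    : { offsetSeconds: best.delta * FRAME_DT, votes: best.count };
}

// Made-up data: the take's three hashes reappear 100 frames into the recording.
const recording = [
  { hash: 11, t: 100 },
  { hash: 22, t: 130 },
  { hash: 33, t: 160 },
  { hash: 11, t: 900 }, // a stray repeat of hash 11, outvoted by the real match
];
const take = [
  { hash: 11, t: 0 },
  { hash: 22, t: 30 },
  { hash: 33, t: 60 },
];

const result = estimateOffset(recording, take);
console.log(result);
```

A take that yields only a handful of fingerprints gives few votes, so the winning count is worth checking before trusting the offset.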

dest4 commented 6 years ago

Hi, the figure in the readme is an example of a 12 s audio sequence containing many fingerprints (the grey lines). You do not need 12 s of audio to get fingerprints: you only need audio sequences of the order of WINDOW_DT * DT = WINDOW_DT * (NFFT / 2 / SAMPLING_RATE) = 96 * (512 / 2 / 22050) = 1.11 s to have a chance of getting fingerprints. So, in my opinion, takes of 3 or 4 seconds are well suited to the current parameter values. Note that, due to the implementation of PRUNING_DT, the last part of each file is left unfingerprinted, here 0.28 s. This library is primarily intended for use on continuous streams.

Is the processing of your files causing the problem? More precisely, do all your fingerprinting inputs have the same sampling rate? Have you tried matching unprocessed takes? (You could cut a few separate unprocessed takes yourself to test this.)

If you do more significant processing, such as voice enhancement or equalization (EQ), my fear is that the tool you want to use, acoustic fingerprinting, is not well suited to your needs.
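The arithmetic above can be checked with a short sketch. WINDOW_DT, NFFT and SAMPLING_RATE are the values quoted in this reply; the value of PRUNING_DT is not given in the thread, so 24 frames is an assumption inferred from the quoted 0.28 s, and `framesToSeconds` is a hypothetical helper rather than part of the library.

```javascript
// Values quoted in the reply above.
const SAMPLING_RATE = 22050; // Hz
const NFFT = 512;            // FFT size, in samples
const WINDOW_DT = 96;        // pairing window, in frames

// Assumption: not stated in the thread, inferred from the quoted 0.28 s.
const PRUNING_DT = 24;       // in frames

// Duration of one frame in seconds (a hop of NFFT / 2 samples).
const DT = NFFT / 2 / SAMPLING_RATE;

// Hypothetical helper: convert a number of frames to seconds.
function framesToSeconds(frames) {
  return frames * DT;
}

const minAudio = framesToSeconds(WINDOW_DT);        // audio needed to start getting fingerprints
const unfingerprinted = framesToSeconds(PRUNING_DT); // unfingerprinted tail of each file

console.log(minAudio.toFixed(2));        // "1.11"
console.log(unfingerprinted.toFixed(2)); // "0.28"
```

Under these values, a 3 s take leaves roughly 2.7 s of audio that can produce fingerprints.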

alenl commented 6 years ago

Strange. That would mean it should have worked. We ended up not using this, because we were able to solve our problem through a completely different approach. But thanks for the reply.