dpwe / audfprint

Landmark-based audio fingerprinting
MIT License

Correct matches, but incorrect timecodes #72

Closed josephschorr closed 4 years ago

josephschorr commented 4 years ago

Hi Dan,

First, thank you for this amazing library!

I'm performing matching of specific clips within a large audio file where I expect there to be, at times, multiple matches. In certain cases, I'm seeing the correct number and kind of matches (the correct clips exist in the wider audio file), but the time codes of the matches are much, much earlier than the actual locations in the audio file.

I'm making use of --find-time-range and I've tried adjusting --time-quantile without any success.

Any thoughts as to why I would be seeing the correct matches, but with very inaccurate time codes? Is there, perhaps, a way to retrieve the full time range of the result of match_hashes? (Likely I'm simply missing the call.)

Thanks!

dpwe commented 4 years ago

Yeah, sorry, sounds like you fell for the --maxtimebits trap.

Times are stored in a fixed number of bits, and the timebase aliases beyond that. By default, it's 14 bits = 16,384 time frames. Each time frame is 256 samples @ 11 kHz (by default), or 23 ms, so a 14 bit timeline corresponds to 16384*0.023 = 6 min, 16 sec. So all times are differences between values projected down onto a circular 6m 16s axis, and so the differences are only correct up to some multiple of this.
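The wrap-around described above can be checked with a few lines of arithmetic. This is only a sketch of the modular timeline, not audfprint's own code: the function names are made up for illustration, and it assumes "11 kHz" means 11025 Hz, which gives about 23.2 ms per frame and a wrap period of roughly 380 s (slightly more than the rounded 6 min 16 s figure).

```python
# Sketch of time aliasing on a fixed-width frame counter.
# Assumptions (not taken from audfprint's source): 11025 Hz sample rate,
# 256-sample hop, and hypothetical helper names.
SAMPLE_RATE_HZ = 11025
HOP_SAMPLES = 256


def frame_seconds():
    """Duration of one time frame in seconds."""
    return HOP_SAMPLES / SAMPLE_RATE_HZ


def wrap_seconds(time_bits):
    """Length of the circular time axis for a given number of time bits."""
    return (1 << time_bits) * frame_seconds()


def aliased_seconds(true_seconds, time_bits=14):
    """Time that would be reported if only `time_bits` bits of the frame index survive."""
    frame = int(round(true_seconds / frame_seconds()))
    stored = frame % (1 << time_bits)  # high bits are lost, so the axis wraps
    return stored * frame_seconds()


if __name__ == "__main__":
    print(f"frame: {frame_seconds() * 1000:.1f} ms")
    print(f"14-bit wrap: {wrap_seconds(14):.1f} s")
    # A match 1000 s into the file lands two full wraps earlier.
    print(f"1000 s reported as: {aliased_seconds(1000.0):.1f} s")
```

Under these assumptions, a match that really sits 1000 s into the reference is reported about 239 s in, which is the "much, much earlier" symptom from the original report.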

To avoid this kind of time aliasing, you have to build your database with a larger --maxtimebits. Using, for instance, 4 extra bits for a total of 18 gives you 2^4 = 16x longer time axis, for up to 100 minutes.

The downside is that the timing and the track ID share a 32 bit word, so allocating 18 bits to time leaves only 14 bits for track ID, so each database will only be able to accept 16,384 reference tracks. I hope this doesn't impact you.
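To see the trade-off at a glance, here is a small sketch that tabulates the split of a 32-bit word between time and track ID for a few --maxtimebits values. It models only what is stated above (time bits plus ID bits equal 32) and assumes an 11025 Hz sample rate with a 256-sample hop; the function name is hypothetical and nothing here calls audfprint itself.

```python
# Sketch of the time-bits vs. track-ID-bits budget in a shared 32-bit word.
# Assumptions: 11025 Hz sample rate, 256-sample hop, and that every bit not
# used for time is available for the track ID. `bit_budget` is a made-up name.
WORD_BITS = 32
FRAME_SECONDS = 256 / 11025


def bit_budget(time_bits):
    """Return the longest unaliased reference and the track capacity for `time_bits`."""
    if not 0 < time_bits < WORD_BITS:
        raise ValueError("time_bits must be between 1 and 31")
    id_bits = WORD_BITS - time_bits
    return {
        "time_bits": time_bits,
        "id_bits": id_bits,
        "max_minutes": (1 << time_bits) * FRAME_SECONDS / 60.0,
        "max_tracks": 1 << id_bits,
    }


if __name__ == "__main__":
    for bits in (14, 16, 18, 20):
        b = bit_budget(bits)
        print(f"{b['time_bits']:>2} time bits: "
              f"{b['max_minutes']:8.1f} min, {b['max_tracks']:>7} tracks")
```

Each extra time bit doubles the unaliased duration and halves the number of reference tracks, so the value worth choosing is the smallest one that covers the longest reference file.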

See SCALING at https://github.com/dpwe/audfprint

Best,

Dan.


josephschorr commented 4 years ago

Hi Dan,

Damn it... I thought it might be something like that, but I missed --maxtimebits! Fortunately, I only have 10-15 clips I need to compare against, so this should be perfect.

Thank you for your continued work and help on this amazing project!