dpwe / audfprint

Landmark-based audio fingerprinting
MIT License

Match timestamps are not accurate #58

Closed dannykopping closed 5 years ago

dannykopping commented 5 years ago

Hey @dpwe - great lib!

I'm struggling with a bit of an issue, and I'm hoping you can shed some light on it.

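Here's a screenshot of an audio file I've authored with 3 clips interwoven at various points in the timeline: [image: image] https://user-images.githubusercontent.com/373762/53203707-935b2700-3632-11e9-8e57-1f1f0377d4d9.png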

The bottom track is where I've inserted some audio markers (jingles in my case). You can see markers at roughly 03:45, 06:45 and 11:20.

These are the commands I used, and the result I received:

➜  audio-fingerprinting rm fpdbase.pklz; python3 audfprint/audfprint.py new --dbase fpdbase.pklz samples/JingleMultiple.mp3 -n30
Thu Feb 21 23:41:25 2019 ingesting #0: samples/JingleMultiple.mp3 ...
Added 77002 hashes (30.9 hashes/sec)
Processed 1 files (2488.7 s total dur) in 11.7 s sec = 0.005 x RT
Saved fprints for 1 files ( 77002 hashes) to fpdbase.pklz (0.00% dropped)
➜  audio-fingerprinting python3 audfprint/audfprint.py match --dbase fpdbase.pklz samples/jingle.wav -n30 -x4 -R                       
Thu Feb 21 23:41:54 2019 Reading hash table fpdbase.pklz
Read fprints for 1 files ( 77002 hashes) from fpdbase.pklz (0.00% dropped)
Thu Feb 21 23:41:56 2019 Analyzed #0 samples/jingle.wav of 5.851 s to 403 hashes
Matched    0.7 s starting at    0.9 s in samples/jingle.wav to time  308.4 s in samples/JingleMultiple.mp3 with    31 of   117 common hashes at rank  0
Matched    1.4 s starting at    0.4 s in samples/jingle.wav to time  226.3 s in samples/JingleMultiple.mp3 with    24 of   117 common hashes at rank  0
Matched    1.0 s starting at    0.5 s in samples/jingle.wav to time   27.2 s in samples/JingleMultiple.mp3 with    21 of   117 common hashes at rank  0
Processed 1 files (6.0 s total dur) in 2.2 s sec = 0.373 x RT

Now, I would've expected matches at (roughly) 225s, 405s and 685s respectively, but the matches are not in the correct place at all. There's nothing in the original track (i.e. not the markers inserted) that is even remotely close to the marker, so I'm not sure where these numbers are coming from.

Any assistance would be greatly appreciated!

dpwe commented 5 years ago

At first glance, this looks like you're hitting the time-aliasing limit. The default settings are tuned for pop-song-like durations, and beyond about 6 minutes the times can "wrap around"; see the final paragraph in the README: https://github.com/dpwe/audfprint/blob/master/README.md.

tl;dr: try rerunning everything with --maxtimebits 16.
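
To make the wrap-around concrete, here is a rough sketch of the arithmetic. It is not taken from the audfprint source: the 11025 Hz sample rate, 256-sample hop, and 14-bit default for stored time offsets are assumptions chosen to be consistent with the "about 6 minutes" figure above, and the function names are hypothetical.

```python
# Assumed defaults (not stated in this thread): 11025 Hz sample rate,
# 256-sample hop between frames, 14 bits for stored time offsets.
SAMPLE_RATE_HZ = 11025
HOP_SAMPLES = 256
DEFAULT_TIME_BITS = 14


def wrap_period_seconds(time_bits, sample_rate=SAMPLE_RATE_HZ, hop=HOP_SAMPLES):
    """Longest time offset (in seconds) representable before frame counts wrap."""
    return (2 ** time_bits) * hop / sample_rate


def aliased_time(true_seconds, time_bits):
    """Where a true offset would appear if times wrap modulo the period."""
    return true_seconds % wrap_period_seconds(time_bits)


if __name__ == "__main__":
    # Compare the assumed default with the suggested --maxtimebits 16.
    for bits in (DEFAULT_TIME_BITS, 16):
        print(f"{bits} bits: wraps after {wrap_period_seconds(bits):.1f} s")
    # The rough marker positions quoted in the question.
    for expected in (225, 405, 685):
        print(f"{expected} s -> {aliased_time(expected, DEFAULT_TIME_BITS):.1f} s")
```

Under these assumptions the limit is about 380 s with 14 bits, so markers near 405 s and 685 s would alias to roughly 24.6 s and 304.6 s, in the neighborhood of the reported 27.2 s and 308.4 s, while the 225 s marker is under the limit and shows up near its true position (226.3 s). With 16 bits the limit would be about 1522 s, which covers all three markers, though it is still shorter than the 2488.7 s file.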

HTH.

DAn.

dannykopping commented 5 years ago

Appreciate the swift response @dpwe. That indeed did the trick! I must've missed that final paragraph.