worldveil / dejavu

Audio fingerprinting and recognition in Python
MIT License
6.45k stars, 1.44k forks

A directory of potentially duplicate audio files? #289

Open mastercoms opened 1 year ago

mastercoms commented 1 year ago

How should I use the dejavu API if I have a whole directory of audio files (around 2000) and don't know which of them are duplicates of each other? Fingerprinting and recognition seem to be separate processes. If two files are duplicates, do they end up under a single fingerprint ID?

Note: I've already hashed the audio streams to find exact duplicates, so at this point I'm only looking at fingerprinting.
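
For illustration, here is a minimal sketch of one way the workflow could be structured, assuming that recognition is attempted on each file before it is fingerprinted. The names `group_duplicates`, `recognize`, and `fingerprint` are hypothetical and not part of dejavu. With dejavu, the two callables would presumably wrap `Dejavu.recognize` and `Dejavu.fingerprint_file`, with a confidence threshold chosen by the caller; the exact return format differs between dejavu versions, so that wrapping is left out here.

```python
from typing import Callable, Dict, Iterable, List, Optional


def group_duplicates(
    paths: Iterable[str],
    recognize: Callable[[str], Optional[str]],
    fingerprint: Callable[[str], None],
) -> Dict[str, List[str]]:
    """Group files by the first-seen file they are recognized as.

    recognize(path) is assumed to return the path of an already
    fingerprinted file that matches, or None when nothing matches.
    fingerprint(path) is assumed to store the file so that later
    calls to recognize() can match against it.
    """
    groups: Dict[str, List[str]] = {}
    for path in paths:
        match = recognize(path)
        if match is None:
            # Nothing similar is stored yet: this file starts a new group.
            fingerprint(path)
            groups[path] = [path]
        else:
            # Recognized as an earlier file: record it as a duplicate
            # and do not fingerprint it again.
            groups.setdefault(match, []).append(path)
    return groups
```

Any group with more than one entry is a set of candidate duplicates. Because only the first file of each group is fingerprinted, the result depends on the order in which the files are processed, and how reliable the matches are depends entirely on the recognizer and threshold behind `recognize`.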