beetbox / pyacoustid

Python bindings for Chromaprint acoustic fingerprinting and the Acoustid Web service
MIT License

Look up didn't work for short samples (10->30sec) #70

Open baptistevericeldevialet opened 2 years ago

baptistevericeldevialet commented 2 years ago

Hi everyone,

I am trying to use pyacoustid to identify songs from short chunks of music (typically 5 to 30 sec, a Shazam-like use case). I could not identify Daft Punk's Get Lucky when giving pyacoustid only a fragment of the song (neither 10 sec nor even 120 sec from the beginning worked), whereas it worked when I provided the whole song.

Here is a snippet of my code:

import acoustid

# API_KEY must hold your AcoustID application API key
filename = 'Get_Lucky_30sec.mp3'
duration, fingerprint = acoustid.fingerprint_file(filename)
data = acoustid.lookup(API_KEY, fingerprint, duration)

Passing the output of fpcalc directly to the web API gives the same result.

For instance, if I call fpcalc on the first 30 seconds of the song, I get the following fingerprint:

./fpcalc /Users/baptistevericel/Documents/dev/tshoko/acoustid_test/var/done/GetLucky_30sec.mp3

DURATION=30
FINGERPRINT=AQAA3UmSTFKaKEkSPFuM6ULZwz9-uDpK8tBiI78k-Ec94SJCeoboFzl6dmgiHmW4cHjxo7kQnjGSXD8eMgdz9Fl-HA9G5MwsQ__wo1aCdJ0tVBuPvKGO_iGa0FlxPoHyJh-aNGgXsBZerD-eB48ELTzKJSVqSYdH5kh_GWGz4WlxSZqg_SKmkBfO_LiyBvrR12gsIj_Oo_8I_3CHU8OUHBdNFLaU48fVGvnxHOLRH_-Ms-h-yD_OXfglVFSGftBfXMejH9aHSxmmqIeeG82YC5Uf0Dy-o18STYbj5DDjo2uW49C046GDd8SzIUueI190jD904kqOB2Hko99x486LHo2T5Yj4Clq0RemR5Rua_Hhk-EekT8JZaRD74FOLPAu2H48RXnA8g10P-iF-sN3h5_BzlMdvaHERHr0O_8JMIh9Eb8iL73CUoeyOFxeaSgoR6fuh58jDcmDeo9d-4MFvzMyRc8rwHKUWNBlv_HjoFM5lPMoenEafY6OEqj9YC_j6BDN16FGOZriQalSOPtHR7Ai1bMbnQt_R7MdzEZP2dLiyB8_hh9CDH-9V9Jdw5IZTHf3xB421HPh4PBvCE78hxoV-rE0bXOjz4SW-hSJ6ToBmHWOMEk4w6AhyCBgiBVJEICAE4AIAxQQyigFgBDCEGAQEI4ILIYhChCgGAEIISIEAAgAJLIEByGgmgCIAQCOIMgAgggAjDAhAlGHEQOcAUoABZhkCBhLGDBGFCAqMAopQIB1AzDCggBIYAAQBMgJABJlxwkCQCBFgCIAIEAIIwIBBSikiCEMGMGaMEcAIoYAgkhAjICMAAWWAEQABQZxRgjDkjPDCIWSgEgIIBg

A lookup through the web API with the following request returns an empty result:

https://api.acoustid.org/v2/lookup?client=m_ngzjtGqA4&meta=recordings+releasegroups+compress&duration=30&fingerprint=AQAA3UmSTFKaKEkSPFuM6ULZwz9-uDpK8tBiI78k-Ec94SJCeoboFzl6dmgiHmW4cHjxo7kQnjGSXD8eMgdz9Fl-HA9G5MwsQ__wo1aCdJ0tVBuPvKGO_iGa0FlxPoHyJh-aNGgXsBZerD-eB48ELTzKJSVqSYdH5kh_GWGz4WlxSZqg_SKmkBfO_LiyBvrR12gsIj_Oo_8I_3CHU8OUHBdNFLaU48fVGvnxHOLRH_-Ms-h-yD_OXfglVFSGftBfXMejH9aHSxmmqIeeG82YC5Uf0Dy-o18STYbj5DDjo2uW49C046GDd8SzIUueI190jD904kqOB2Hko99x486LHo2T5Yj4Clq0RemR5Rua_Hhk-EekT8JZaRD74FOLPAu2H48RXnA8g10P-iF-sN3h5_BzlMdvaHERHr0O_8JMIh9Eb8iL73CUoeyOFxeaSgoR6fuh58jDcmDeo9d-4MFvzMyRc8rwHKUWNBlv_HjoFM5lPMoenEafY6OEqj9YC_j6BDN16FGOZriQalSOPtHR7Ai1bMbnQt_R7MdzEZP2dLiyB8_hh9CDH-9V9Jdw5IZTHf3xB421HPh4PBvCE78hxoV-rE0bXOjz4SW-hSJ6ToBmHWOMEk4w6AhyCBgiBVJEICAE4AIAxQQyigFgBDCEGAQEI4ILIYhChCgGAEIISIEAAgAJLIEByGgmgCIAQCOIMgAgggAjDAhAlGHEQOcAUoABZhkCBhLGDBGFCAqMAopQIB1AzDCggBIYAAQBMgJABJlxwkCQCBFgCIAIEAIIwIBBSikiCEMGMGaMEcAIoYAgkhAjICMAAWWAEQABQZxRgjDkjPDCIWSgEgIIBg

Response:

{"results": [], "status": "ok"}
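For anyone reproducing this without pyacoustid, here is a minimal sketch of the same request and of how to tell "no match" apart from a failed call. It uses only the standard library. The helper names `build_lookup_url` and `extract_matches` are made up for illustration, and the `score` / `recordings` / `id` fields are my assumption about what a non-empty response looks like, since the response above contains no results to confirm it.

```python
import json
from urllib.parse import urlencode

LOOKUP_URL = "https://api.acoustid.org/v2/lookup"


def build_lookup_url(client, fingerprint, duration, meta=("recordings",)):
    """Build a GET URL for the lookup endpoint from fpcalc's output.

    Hypothetical helper; parameter names mirror the request shown above.
    """
    params = {
        "client": client,
        "meta": " ".join(meta),  # urlencode turns the spaces into '+'
        "duration": int(duration),
        "fingerprint": fingerprint,
    }
    return LOOKUP_URL + "?" + urlencode(params)


def extract_matches(response_text):
    """Return (score, recording_id) pairs from a lookup response body.

    Raises ValueError if the service did not report status "ok".
    The result fields read here are assumed, not taken from this thread.
    """
    data = json.loads(response_text)
    if data.get("status") != "ok":
        raise ValueError("lookup failed: %r" % (data,))
    matches = []
    for result in data.get("results", []):
        for recording in result.get("recordings", []):
            matches.append((result.get("score"), recording.get("id")))
    return matches


# The response from the request above: the call succeeded but nothing matched.
print(extract_matches('{"results": [], "status": "ok"}'))  # prints []
```

An empty list with status "ok" means the request itself was valid and the service simply found no match for that fingerprint and duration.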

Is this behavior expected or do I need to use the library differently?

Thanks for your feedback

twynb commented 1 year ago

That's expected behaviour - as the AcoustID FAQ states:

Can the service identify short audio snippets?

No, it can't. The service has been designed for identifying full audio files. We would like to eventually support also this use case, but it's not a priority at the moment. Note that even when this will be implemented, it will be still intended for matching the original audio (e.g. for the purpose of tracklisting a long audio stream), not audio with background noise recorded on a phone.