using .pklz against .pklz for match

utsav-195 commented 5 years ago

Hey, can we match one .pklz file against another .pklz file when querying? For example: python3 audfprint.py match --dbase ads.pklz recs.pklz --find-time-range --exact-count --max-matches 200 --min-count 50 --opfile results.out

dpwe commented 5 years ago

I think you have to loop over the queries. This wouldn't be hard, but it's not part of the current command-line interface. You'd have to open the pklz file, retrieve the description of each file it contains (from hash_table.names) with hash_table.retrieve(), then do the match with matcher.match_hashes().

Pulling the individual track descriptions out of the hash table is inefficient; you're better off keeping each track as a separate precomputed .afpt file, then looping over those.
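A minimal sketch of the loop described above, written against duck-typed objects rather than audfprint itself. The function name match_all_tracks and its argument order are hypothetical; the only calls it relies on are the ones named in this comment (names, retrieve() and match_hashes()), and the exact signatures and return values are assumptions that should be checked against the audfprint source.

```python
def match_all_tracks(matcher, query_table, ref_table):
    """Match every track stored in query_table against ref_table.

    Assumptions (not verified against audfprint here):
      - query_table.names lists the stored track names,
      - query_table.retrieve(name) returns that track's (time, hash) pairs,
      - matcher.match_hashes(table, hashes) returns the match rows.
    """
    results = {}
    for name in query_table.names:
        if not name:
            # Skip empty slots (assumed to mark tracks removed from the table).
            continue
        hashes = query_table.retrieve(name)
        results[name] = matcher.match_hashes(ref_table, hashes)
    return results
```

For the example in the question, query_table would be the table loaded from recs.pklz and ref_table the one loaded from ads.pklz. How those objects are constructed is left out on purpose, since the constructor details are not stated in this thread.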

DAn.


bharat-patidar commented 5 years ago

Hi Dan, your work is just wonderful. I have a similar question: can we do an .afpt vs .afpt match? The reason I ask is that my query and reference items might get swapped, so I want to preserve the fingerprints of both (query and reference). I also see there are two ways to do this: I can keep either pickle or afpt files. For this scenario, which would you suggest, or which do you think would be easier to implement: pickle vs pickle, or afpt vs afpt?

Thanks for the help!

dpwe commented 5 years ago

So, an afpt (precompute) file contains the landmarks from just one track, in simple time order. A pklz (database) file is designed to describe an entire collection of files, and is an index based on the landmark values (so the related tracks can be quickly retrieved given landmarks from the query). Matching a single pair of files (e.g., one afpt against another) is in principle much simpler and need not involve the indexed structure. However, audfprint only works from the index, so to match one afpt against another, you'll need to convert one into a very sparsely-occupied database, then match against it:

audfprint new --dbase dbase.pklz file_one.afpt        # Create dbase.pklz
audfprint match --dbase dbase.pklz file_two.afpt      # Perform the match
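To illustrate why a direct pairwise match is simpler in principle, here is a rough sketch that compares two time-ordered landmark lists by voting on time offsets. This is not audfprint's implementation and does not read .afpt files; the function name pairwise_match and the (time, hash) tuple layout are assumptions made for the example.

```python
from collections import Counter, defaultdict


def pairwise_match(landmarks_a, landmarks_b):
    """Return (best_offset, count) for two lists of (time, hash) pairs.

    best_offset is time_b - time_a for the most common alignment of
    shared hashes; (None, 0) is returned when no hash is shared.
    """
    # Group the times at which each hash occurs in track A.
    times_by_hash = defaultdict(list)
    for t_a, h in landmarks_a:
        times_by_hash[h].append(t_a)

    # Every shared hash votes for the time offset it implies.
    votes = Counter()
    for t_b, h in landmarks_b:
        for t_a in times_by_hash.get(h, ()):
            votes[t_b - t_a] += 1

    if not votes:
        return None, 0
    return votes.most_common(1)[0]
```

Swapping the two arguments only flips the sign of the offset, which is relevant when the query and reference roles may be exchanged. A real matcher would also apply a minimum count before accepting the result, along the lines of the --min-count option used earlier in this thread.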

Hope that helps.

DAn.

bharat-patidar commented 5 years ago

Thank you so much Dan. I got the answer.

fat84 commented 5 years ago

Hello, can this be used to match against a streaming audio URL (Shoutcast type), in the style of Dejavu, ACRCloud, Echoprint (The Echo Nest), or Spotify? Any recommendations? Thanks.

renbanford commented 5 years ago

Hi fat84, it would be more appropriate for you to create a new issue. Also, this issue would probably be better marked as closed.

dpwe / audfprint

using .pklz against .pklz for match #53