matching many files - Githubissues

aminebya1 commented 4 years ago

Is it possible to modify the code so it can match many files at once?

dpwe commented 4 years ago

The matching is intrinsically 1 query against the whole reference set, but each match is quite quick so you can run through a lot of queries in a loop.
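
A minimal sketch of the query loop described above. It is only an illustration: `match_one` is a hypothetical callable standing in for whatever runs a single query against the reference set (for example, a wrapper you write around the matcher), and it is not part of audfprint itself.

```python
from typing import Callable, Dict, Iterable, List


def match_all(queries: Iterable[str],
              match_one: Callable[[str], List[str]]) -> Dict[str, List[str]]:
    """Run each query file against the reference set, one at a time.

    match_one is a hypothetical helper (an assumption for this sketch):
    it takes a query path and returns the reference ids that matched it.
    """
    results = {}
    for query in queries:
        # Each call is an independent 1-query-vs-whole-reference-set match.
        results[query] = match_one(query)
    return results
```

Because each query is independent, the loop could also be split across processes if a single pass turns out to be too slow.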

The idea of looking for any matches among a large collection is an interesting variant. There might be a more efficient way of structuring that, but it's not trivial.

aminebya1 commented 4 years ago

Can you tell me where to modify the code, or give me some hints?

vikasmultitv commented 4 years ago

Thanks for this wonderful code, which works very well. In my case I don't know whether a chunk will contain matches to 1, 2, or 3 different files, so I am using --max-matches 5, but with this setting I get duplicate matches of the same audio file. Could you please help me with that?

dpwe commented 4 years ago

There's a field, Matcher.max_alignments_per_id, which controls how many matches are returned for each ref item. It's 100 by default and there's no way to control it at present. You could try manually restricting it (line 122 of audfprint_match.py), or you could add a new command-line option and set it as part of audfprint.setup_matcher (around line 316 of audfprint.py). I believe it will keep the most significant matches first. This won't affect the behavior when --exact-count is true.

DAn.

On Wed, Jan 15, 2020 at 4:20 AM vikasmultitv notifications@github.com wrote:

Thanks for this wonderful code, which works very well. In my case I don't know whether a chunk will contain matches to 1, 2, or 3 different files, so I am using --max-matches 5, but with this setting I get duplicate matches of the same audio file. Could you please help me with that?

vikasmultitv commented 4 years ago

Thank you for replying :) I tried changing self.max_alignments_per_id = 100 to self.max_alignments_per_id = 0 and it worked, but it misses the match when there is only 1 match. I guess I will try some other approach in post-processing.
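
A post-processing pass of the kind mentioned above might look like the following sketch. The `(ref_id, score)` tuple layout and the function name `keep_best_per_ref` are assumptions made for illustration; they are not audfprint's actual result format or API.

```python
from typing import List, Tuple


def keep_best_per_ref(matches: List[Tuple[str, int]],
                      max_per_ref: int = 1) -> List[Tuple[str, int]]:
    """Keep at most max_per_ref hits for each reference id.

    matches is assumed to be a list of (ref_id, score) pairs already
    sorted with the most significant match first (hypothetical layout).
    """
    seen = {}  # ref_id -> number of hits kept so far
    kept = []
    for ref_id, score in matches:
        if seen.get(ref_id, 0) < max_per_ref:
            kept.append((ref_id, score))
            seen[ref_id] = seen.get(ref_id, 0) + 1
    return kept
```

With `max_per_ref=1`, a chunk that matches two or three different reference files keeps one hit for each, and a chunk with a single match keeps that one hit.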

vikasmultitv commented 4 years ago

Changing "if found_this_id > self.max_alignments_per_id:" to "if found_this_id >= self.max_alignments_per_id:" worked fine :) Thank you.

dpwe commented 4 years ago

Glad it worked! max_alignments_per_id = 0 seems like an odd choice, I was thinking max_alignments_per_id = 1 (we want at most 1 hit per reference item), but as long as it's doing what you want, it's your fork!

DAn.
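
How the limit and the comparison operator interact can be seen in a toy loop. This is a sketch under a stated assumption: the check is taken to run before a hit is recorded, which may not match the real ordering in audfprint_match.py. The function `collect` and its arguments are hypothetical.

```python
def collect(alignments, cap, inclusive):
    """Return the alignments kept for one reference id under a cap.

    Assumption for this sketch: the limit is tested *before* each hit
    is recorded. inclusive=True models '>=', inclusive=False models '>'.
    """
    kept = []
    found_this_id = 0
    for alignment in alignments:
        if inclusive:
            limit_reached = found_this_id >= cap
        else:
            limit_reached = found_this_id > cap
        if limit_reached:
            break
        kept.append(alignment)
        found_this_id += 1
    return kept
```

Under this assumed ordering, `>` with a cap of 0 and `>=` with a cap of 1 both keep exactly one alignment per id, while `>=` with a cap of 0 keeps none.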

On Wed, Jan 15, 2020 at 9:15 AM vikasmultitv notifications@github.com wrote:

Changing "if found_this_id > self.max_alignments_per_id:" to "if found_this_id >= self.max_alignments_per_id:" worked fine :) Thank you.

vikasmultitv commented 4 years ago

I am sorry, my mistake. It is actually not working: it misses the match when there is only one match in a chunk. My chunks are 1 minute long and each can contain 2-3 audio files.

dpwe / audfprint

matching many files #73