dpwe / audfprint

Landmark-based audio fingerprinting
MIT License
544 stars, 122 forks

Some matches are inaccurate: are the hashes and times generated by audfprint continuous from start to end, or are they peaks? A longer song collects more hash matches than a shorter one, which skews the ranking. How can this be optimized? #91

Open xuboot opened 1 year ago

xuboot commented 1 year ago

I found that some matches are inaccurate. Are the times and hashes generated by audfprint continuous between the start and end of a track, or are they peak values? Suppose one song is 1 minute long and another is 3 minutes long. Many hashes from the 1-minute song also appear in the last two minutes of the 3-minute song, so the hash count for the longer song ends up larger than the count for the 1-minute song. The sorting is then wrong and the match is inaccurate. How can I optimize this?

dpwe commented 1 year ago

Hi, I'm not sure what kind of response you're after. The matching is intrinsically noisy - there will always be some spurious hash matches, which is why the ranking is based on the total number of time-aligned matches. Your questions are pretty general and there are no quick answers. If I knew easy ways to improve the efficiency or accuracy of the system, I would have implemented them already. I'm sure there's room for improvement, but you'll probably need to get deeply into how the code works for yourself.

Sorry I can't be more helpful.

DAn.
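
To illustrate the point about time-aligned matches, here is a minimal Python sketch on made-up data. It is not audfprint's actual code: the function names, the `index` layout (hash -> list of `(track_id, time)`), and the landmark values are all assumptions for this example. It compares ranking by raw hash count with ranking by the number of hits that agree on one time offset between query and reference.

```python
from collections import Counter, defaultdict

# Hypothetical toy reference data: track_id -> list of (time, hash) landmarks.
reference = {
    "short": [(5, 11), (6, 22), (7, 33)],
    "long": [(10, 22), (50, 11), (70, 22), (90, 33), (120, 11)],
}

# Inverted index: hash -> list of (track_id, time).
index = defaultdict(list)
for track_id, landmarks in reference.items():
    for t, h in landmarks:
        index[h].append((track_id, t))

def raw_hash_counts(query, index):
    """Count every hash hit per track, ignoring time entirely."""
    counts = Counter()
    for _, h in query:
        for track_id, _ in index.get(h, ()):
            counts[track_id] += 1
    return counts

def aligned_counts(query, index):
    """Per track, return (offset, count) for the most common time offset."""
    offsets = defaultdict(Counter)
    for q_time, h in query:
        for track_id, ref_time in index.get(h, ()):
            offsets[track_id][ref_time - q_time] += 1
    return {t: c.most_common(1)[0] for t, c in offsets.items()}

# Query: an excerpt of "short", with times measured from the excerpt start.
query = [(0, 11), (1, 22), (2, 33)]

raw = raw_hash_counts(query, index)
aligned = aligned_counts(query, index)
print("raw counts:", dict(raw))
print("aligned (offset, count):", aligned)
```

In this toy data the longer track wins on raw count simply because it contains more landmarks that happen to share hashes with the query, but none of those hits agree on a time offset. Counting only hits at the dominant offset ranks the true source first.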

xuboot commented 1 year ago

Thank you. What I mainly do is cut and merge excerpts from one song or several songs, then run audfprint on them to get the raw hashes and times. My search logic counts the number of matching hashes, sorts by that count, and returns the result to the user. I found that the hashes and times generated by audfprint look random rather than continuous, which is what makes my search inaccurate. Thank you very much. audfprint is a good project.