marin-m / SongRec

An open-source Shazam client for Linux, written in Rust.
GNU General Public License v3.0

Option to ignore songs that last less than a minimum number of seconds #105

Open rocaej opened 2 years ago

rocaej commented 2 years ago

Hi! I mentioned this as a comment in another issue but figured it might be worth its own thread.

Right now, SongRec breaks down the input stream into 12 second slices, and samples the middle of each one to recognize a song (this is what I understand from algorithm.rs, although microphone_thread.rs seems to sample every 4 seconds, not the middle).

While this works great most of the time, my particular use case is recognizing songs from online radio stations. Radio commercials tend to have background music, which SongRec recognizes as songs.

As I understand from @marin-m's comments in Issue 35, SongRec already discards songs that are recognized more than once in a row (although I haven't found the code for this, yet).

It seems to me that a possible solution would be to add an option for SongRec to ignore any song that isn't detected at least 2 or 3 times in a row*, and to pass the recognition result on only once that threshold is reached. After that, further repeat samples could be discarded as they are right now.

*I say 2x or 3x since commercials do not usually last more than 30 seconds, but this could theoretically be any arbitrary number.
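
To make the proposal concrete, here is a minimal sketch of such a filter. It is not taken from SongRec's code: `ConsecutiveFilter`, `observe`, and the use of a string track identifier are all hypothetical names chosen for illustration, and the sketch assumes each recognition attempt yields either a track identifier or nothing.

```rust
/// Hypothetical filter (not part of SongRec): reports a track only once it
/// has been recognized `required` times in a row.
pub struct ConsecutiveFilter {
    required: u32,
    current: Option<String>,
    streak: u32,
}

impl ConsecutiveFilter {
    /// `required` is clamped to at least 1 so that a value of 0 behaves
    /// like "report on first detection".
    pub fn new(required: u32) -> Self {
        Self { required: required.max(1), current: None, streak: 0 }
    }

    /// Feed one recognition result (`None` means nothing was recognized).
    /// Returns `Some(id)` only on the sample where the streak first reaches
    /// the threshold; later repeats of the same track return `None`.
    pub fn observe(&mut self, result: Option<&str>) -> Option<String> {
        match result {
            // Same track as the previous sample: extend the streak.
            Some(id) if self.current.as_deref() == Some(id) => {
                self.streak = self.streak.saturating_add(1);
            }
            // A different track: start a new streak.
            Some(id) => {
                self.current = Some(id.to_string());
                self.streak = 1;
            }
            // Nothing recognized: reset.
            None => {
                self.current = None;
                self.streak = 0;
            }
        }
        if self.streak == self.required {
            self.current.clone()
        } else {
            None
        }
    }
}
```

With `ConsecutiveFilter::new(3)`, a track seen in only two consecutive samples (say, music behind a short commercial) is never reported, while a track seen in three consecutive samples is reported once, on the third. Whether a failed recognition should reset the streak, as it does here, is a design choice that may be too strict for noisy streams.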

Hopefully this makes some sense. I've been looking at the code to try and hack an implementation myself, but I can't find where SongRec is comparing the different samples and ignoring the ones that are identical. Feel free to point me towards that and I'll look into it!

Xsmael commented 1 year ago

That's very interesting. I'd suggest making these numbers parameters that can be easily tweaked to suit one's needs, rather than hardcoding them.
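
One way to expose this as a tweakable parameter is to let the user give a minimum duration in seconds and derive the number of consecutive detections from it. The helper below is only a sketch: `required_detections` is a hypothetical name, and it assumes recognition attempts happen at a fixed interval, so that k consecutive detections cover at least (k - 1) intervals.

```rust
use std::time::Duration;

/// Hypothetical helper (not part of SongRec): number of consecutive
/// detections needed so that the detections span at least `min_duration`,
/// assuming one recognition attempt every `sample_interval`.
pub fn required_detections(min_duration: Duration, sample_interval: Duration) -> u32 {
    // Avoid dividing by zero; fall back to "report on first detection".
    if sample_interval.is_zero() {
        return 1;
    }
    let ratio = min_duration.as_secs_f64() / sample_interval.as_secs_f64();
    // k detections span (k - 1) intervals, hence the extra 1.
    (ratio.ceil() as u32).saturating_add(1)
}
```

Under this assumption, a 30 second minimum with one attempt every 12 seconds gives 4 detections, and with one attempt every 4 seconds it gives 9. The actual sampling interval would need to be confirmed against the code before relying on these numbers.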