worldveil / dejavu

Audio fingerprinting and recognition in Python
MIT License

Multiple identifications in one recording #99

Open richoakley opened 8 years ago

richoakley commented 8 years ago

If there are multiple songs present in a file passed to FileRecognizer, is there a way to specify that all recognitions be returned? Right now, it appears that only the first is returned.

shemul commented 7 years ago

@thesunlover please answer :)

thesunlover commented 7 years ago

@shemul @richoakley That is the default behaviour. You would need to customize a local copy so that it yields all of the results.

ksk438 commented 7 years ago

Did you have any luck with this? I have a similar issue, where only part of the clip should match the key, but should match the entire key (lots of noise before and after).

mauriziocarella commented 7 years ago

I also have the same problem. I fingerprinted a short audio file and then concatenated it with two songs like:

But when I analyze the concatenated file, only one match is found.

thesunlover commented 7 years ago

maurizio96 May I ask why you need it to work like that? Couldn't there be a single fingerprint for uniqueness, with you counting the number of times it matched? You could also write a new lookup function that takes a starting timepoint, but usually you are looking up a 6 to 7 second recording in the big DB.

mauriziocarella commented 7 years ago

I need it to work like that because I have a 1-hour recording in which the short audio is played several times (about 4 times per hour), and I would like to get the start position of each occurrence.

thesunlover commented 7 years ago

oh, ok, then you might need a new search function with a starting timepoint, I am going to have a look

mauriziocarella commented 7 years ago

Thank you! I hope I explained my problem correctly. My goal is to find the positions where my file occurs inside the big file.

thesunlover commented 7 years ago

The main changes are going to be in find_matches() and align_matches(). Can you check what db.return_matches() returns?

mauriziocarella commented 7 years ago

I have located the functions, but I don't understand how to change them to reach my goal.

mauriziocarella commented 7 years ago

Do you think the problem comes from this code?

    if diff_counter[diff][sid] > largest_count:
        largest = diff
        largest_count = diff_counter[diff][sid]
        song_id = sid

I think this is why the recognition returns only one result.

thesunlover commented 7 years ago

yep, note: I don't have any working copy of the repo so I cannot see what data is coming from the two functions. I just know that in this function

    def _recognize(self, *data):
        matches = []
        for d in data:
            matches.extend(self.dejavu.find_matches(d, Fs=self.Fs))
        return self.dejavu.align_matches(matches)

the data consists of one or two channels, so there should be one or two d's, and you have to handle the match data from both channels if the file has two channels.

mauriziocarella commented 7 years ago

With the current working copy of the repo I can get the ShortAudio position by reading the key 'offset_seconds' from the recognize result. Unfortunately it lists only one result. I am trying to get multiple results by changing this code:

    if diff_counter[diff][sid] > largest_count:
        largest = diff
        largest_count = diff_counter[diff][sid]
        song_id = sid

From what I understand it is limited to one result because of the 'if'

thesunlover commented 7 years ago

It seems to be designed for a single match, as the iteration is over the values that are being looked up (1k hashes per DB call). Note: I am not familiar enough with SQL, so I am not sure I can help right now.

mauriziocarella commented 7 years ago

I cannot get it. I see the query "SELECT_MULTIPLE" but it seems to select songs based on HASH file. I have only one record in the DB that matches to the ShortAudio. If I analyze a file with the following structure:

It (as you say) returns only one match, but I would like to get the start position of both ShortAudio occurrences.

Sorry for many messages but I am trying to be as clear as possible

thesunlover commented 7 years ago

You'd better ask on Stack Overflow how to modify the SQL search criteria to select multiple pieces from the same file.

mauriziocarella commented 7 years ago

Sorry, but I cannot understand why I need to edit the SQL search. If the align_matches() function is called only once, how can it return multiple results when the following code runs?

    if diff_counter[diff][sid] > largest_count:
        largest = diff
        largest_count = diff_counter[diff][sid]
        song_id = sid

    song = self.db.get_song_by_id(song_id)
    if song:
        songname = song.get(Dejavu.SONG_NAME, None)
    else:
        return None

    nseconds = round(float(largest) / fingerprint.DEFAULT_FS *
                     fingerprint.DEFAULT_WINDOW_SIZE *
                     fingerprint.DEFAULT_OVERLAP_RATIO, 5)
    song = {
        Dejavu.SONG_ID : song_id,
        Dejavu.SONG_NAME : songname,
        Dejavu.CONFIDENCE : largest_count,
        Dejavu.OFFSET : int(largest),
        Dejavu.OFFSET_SECS : nseconds,
        Database.FIELD_FILE_SHA1 : song.get(Database.FIELD_FILE_SHA1, None),}
    return song

The song_id is replaced every time the 'if' condition is true, and the function returns only one object for the match. How can it return multiple song objects? I am studying the SQL build script, but it is more complicated...

Note: you are helping me a lot :+1: :smile: thank you!
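The selection loop quoted above can be tried in isolation. The following is a minimal sketch, not the library's code: `align_single` is a hypothetical stand-in for the counting part of align_matches(), and the `(song_id, diff)` pairs are made up. It suggests why a second occurrence of the same clip is dropped: only the single largest count survives.

```python
def align_single(matches):
    """Toy stand-in for the quoted loop: tracks only the global maximum."""
    diff_counter = {}
    largest, largest_count, song_id = 0, 0, -1
    for sid, diff in matches:
        # Count how many hashes agree on the same (diff, song) alignment.
        diff_counter.setdefault(diff, {}).setdefault(sid, 0)
        diff_counter[diff][sid] += 1
        # Only the single best (diff, song) pair is remembered.
        if diff_counter[diff][sid] > largest_count:
            largest = diff
            largest_count = diff_counter[diff][sid]
            song_id = sid
    return song_id, largest, largest_count


# Made-up data: song 1 aligns at diff -100 (5 hashes) and again at
# diff -900 (4 hashes); song 2 has one stray hash.
matches = [(1, -100)] * 5 + [(1, -900)] * 4 + [(2, 37)]
result = align_single(matches)
print(result)  # (1, -100, 5)
```

The alignment at diff -900 has four supporting hashes, but it never exceeds the running maximum of five, so it does not appear in the result.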

atlantis0 commented 7 years ago

I have a very similar use case. The method align_matches(self, matches) is very interesting. Can anyone shed some light on how this selection works? If so, how could we match multiple results?

Especially, the diff variable which is obtained from db?
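On the question of what diff is: a rough sketch, assuming that each diff is the offset of a hash in the database minus the offset of the same hash in the recording being analyzed. That convention is an assumption about db.return_matches() and is not confirmed anywhere in this thread; `toy_return_matches` and the data below are hypothetical. Under that assumption, hashes that belong to the same true alignment all produce the same diff, so counting identical `(song_id, diff)` pairs gives a histogram whose peaks are candidate matches.

```python
from collections import Counter


def toy_return_matches(sample_hashes, db_rows):
    """Yield (song_id, diff) for every sample hash that is also in the DB.

    sample_hashes: iterable of (hash, offset_in_sample)
    db_rows:       iterable of (hash, song_id, offset_in_song)
    """
    mapper = dict(sample_hashes)  # hash -> offset in the analyzed sample
    for h, sid, db_offset in db_rows:
        if h in mapper:
            # Assumed convention: database offset minus sample offset.
            yield sid, db_offset - mapper[h]


# Made-up fingerprints: song 1's hashes a, b, c sit at offsets 10, 12, 15
# in the DB and at 110, 112, 115 in the sample; hash z is a coincidence.
db_rows = [("a", 1, 10), ("b", 1, 12), ("c", 1, 15), ("z", 2, 40)]
sample_hashes = [("a", 110), ("b", 112), ("c", 115), ("z", 3)]

pairs = list(toy_return_matches(sample_hashes, db_rows))
histogram = Counter(pairs)
print(histogram.most_common())  # [((1, -100), 3), ((2, 37), 1)]
```

With this sign convention, a short clip stored in the database and found later in a long recording yields a negative diff; the sign would flip if the subtraction were done the other way round.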

spotranking786 commented 6 years ago

any luck with this ?

alonek1 commented 6 years ago

same problem here, any idea for finding multiple matches in a file or from microphone ?

foxbit commented 6 years ago

Same problem, i need to identify the time of multiple matches in the file. Any solution?

fabriziocarraro commented 6 years ago

Did anybody manage to get multiple results? When I try to recognise a song, I would like to get many results in descending order of confidence. That way I can find out whether a short song sequence appears in more than one song in the library. I'm trying to modify "def align_matches(self, matches)" but it is not working yet.

ghost commented 6 years ago

This will return all matches sorted by confidence.

def align_matches(self, matches):
    """
        Finds hash matches that align in time with other matches and finds
        consensus about which hashes are "true" signal from the audio.

        Returns a list of match dictionaries, sorted by confidence
        (highest first).
    """
    # align by diffs
    diff_counter = {}
    song_ids = {}

    for tup in matches:
        sid, diff = tup
        if diff not in diff_counter:
            diff_counter[diff] = {}
        if sid not in diff_counter[diff]:
            diff_counter[diff][sid] = 0
        diff_counter[diff][sid] += 1

    for diff in diff_counter:
        for sid in diff_counter[diff]:
            if sid not in song_ids:
                song_ids[sid] = [0, '']
            if diff_counter[diff][sid] > song_ids[sid][0]:
                song_ids[sid][0] = diff_counter[diff][sid]
                song_ids[sid][1] = diff

    songs_detailed = []
    for song_id in song_ids:
        confidence, offset = song_ids[song_id]
        # extract identification
        song = self.db.get_song_by_id(song_id)
        if song:
            nseconds = round(float(offset) / fingerprint.DEFAULT_FS *
                             fingerprint.DEFAULT_WINDOW_SIZE *
                             fingerprint.DEFAULT_OVERLAP_RATIO, 5)
            songs_detailed.append({
                Dejavu.SONG_ID : song_id,
                Dejavu.SONG_NAME : song.get(Dejavu.SONG_NAME, None),
                Dejavu.CONFIDENCE : confidence,
                Dejavu.OFFSET : int(offset),
                Dejavu.OFFSET_SECS : nseconds,
                Database.FIELD_FILE_SHA1 : song.get(Database.FIELD_FILE_SHA1, None)})

    return sorted(songs_detailed, key=lambda x: x[Dejavu.CONFIDENCE], reverse=True)

Remove these lines from the recognize_file function in recognize.py:

if match:
    match['match_time'] = t

or replace them with

for m in match:
    m.update({'match_time': t})
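Note that the patched align_matches() above keeps one best offset per song, so several occurrences of the same clip inside one long recording would still collapse into a single entry. One possible direction, sketched here under stated assumptions and not tested against dejavu itself, is to keep every diff whose count clears a threshold. The function `find_occurrences` and its parameters `min_count` and `merge_within` are hypothetical names, and sensible threshold values would depend on the audio.

```python
from collections import Counter


def find_occurrences(matches, min_count=3, merge_within=2):
    """Return {song_id: [(diff, count), ...]} for every well-supported diff.

    matches:      iterable of (song_id, diff) pairs
    min_count:    minimum number of agreeing hashes to accept an alignment
    merge_within: neighbouring diffs this close are treated as one occurrence
    """
    counts = Counter(matches)
    per_song = {}
    # Sorting by (song_id, diff) lets adjacent diffs be merged in one pass.
    for (sid, diff), count in sorted(counts.items()):
        if count < min_count:
            continue  # too few agreeing hashes, likely a coincidence
        peaks = per_song.setdefault(sid, [])
        if peaks and abs(diff - peaks[-1][0]) <= merge_within:
            # Same occurrence as the previous peak: keep the stronger one.
            if count > peaks[-1][1]:
                peaks[-1] = (diff, count)
        else:
            peaks.append((diff, count))
    return per_song


# Made-up data: song 1 occurs twice (around diff -100 and at diff -900).
matches = ([(1, -100)] * 5 + [(1, -99)] * 3 +
           [(1, -900)] * 4 + [(2, 37)])
occurrences = find_occurrences(matches)
print(occurrences)  # {1: [(-900, 4), (-100, 5)]}
```

Each retained diff could then be converted to seconds with the same formula that align_matches() uses for OFFSET_SECS.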