mimbres / neural-audio-fp

https://mimbres.github.io/neural-audio-fp
MIT License

Fingerprint generation from custom dataset #39

Closed by AntonisKarnavas 11 months ago

AntonisKarnavas commented 1 year ago

Hi, first of all, great work. I am trying to use this model to generate fingerprints for a single movie audio track (let's say 2 hours long). I trained the model following your instructions, and for the `-s` argument of the `python run.py generate` command I passed the directory containing the movie audio. The fingerprints are generated, all good. I then read the produced memmap and convert it into a numpy array; with duration=1 and hop=0.5 as parameters, it has shape (16353, 128). All good until now.

However, when I inspect the produced fingerprints, they repeat every 125 entries, meaning the 16353 fingerprints form roughly 16353/125 ≈ 131 blocks that are all the same. Why is that? Don't you crop the audio file into segments and then pass them through the model, or should I do that beforehand? Thanks!
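For reference, this is roughly how I check the repetition (the path is just a placeholder from my setup, not the repo's exact output layout):

```python
import numpy as np

# Placeholder path from my setup; replace with the memmap produced by
# `python run.py generate`. dtype assumed to be float32; the shape matches
# the (16353, 128) I mentioned above.
fp_path = "output/custom_source/fp.mm"
n_items, dim = 16353, 128

fp = np.array(np.memmap(fp_path, dtype="float32", mode="r", shape=(n_items, dim)))

# Compare consecutive blocks of 125 fingerprints (the test batch size).
period = 125
n_blocks = n_items // period
blocks = fp[: n_blocks * period].reshape(n_blocks, period, dim)
print("all blocks identical:", np.allclose(blocks, blocks[0]))  # prints True in my case
```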

mimbres commented 11 months ago

Hi @AspiringDeveloper00, sorry for the late reply. I currently don't have the environment set up to run the code, but my guess is that this part of the code is related to the repeating behaviour you mentioned:

https://github.com/mimbres/neural-audio-fp/blob/058d812df3787a7e000c6f595e200fd2e15ee348/model/dataset.py#L312

_ts_n_anchor should be set to the same value as your test batch size, i.e. 125, assuming that is what is defined in your config. If _ts_n_anchor is 1, it will fill each batch with identical clones of a single segment.
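As a rough illustration only (not the actual dataset.py code), this is the kind of duplication that happens when the anchor count is 1 but the batch size is 125: the single anchor segment is effectively tiled to fill the batch, so the model emits 125 identical fingerprints per batch.

```python
import numpy as np

# Toy sketch, not the real dataset.py logic: with a test batch size of 125 and
# n_anchor = 1, the single anchor segment gets replicated to fill the batch,
# so every fingerprint in that batch comes from the same audio.
bsz = 125          # test batch size from the config
n_anchor = 1       # misconfigured; should equal bsz for fingerprint generation
seg = np.random.randn(n_anchor, 8000).astype("float32")  # one dummy 1-second segment

batch = np.repeat(seg, bsz // n_anchor, axis=0)  # shape (125, 8000), all rows identical
print(np.allclose(batch, batch[0]))              # True -> 125 clones of one segment
```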

It's been too long since I last tried using a custom dataset, so I can't remember the details well. I hope this helps you solve the problem.

AntonisKarnavas commented 11 months ago

Hey, thanks for answering. I figured it out: it happened because I had transferred the checkpoints to another machine to generate the fingerprints there, and I possibly ran into a bug in the process. I retrained the model on the new machine and everything works now!