feranzie closed this 1 month ago
Hi @feranzie, Interesting. When you say 0.17-0.19, do you mean 17% or 0.17%?
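To make the fraction-versus-percent question concrete, here is a minimal sketch of how a top-1 exact match rate could be computed and printed both ways. The function name `top1_exact_rate` and the toy IDs are hypothetical and not taken from the repo's evaluation code.

```python
def top1_exact_rate(pred_ids, true_ids):
    """Fraction of queries whose top-1 retrieved ID equals the ground-truth ID."""
    if len(pred_ids) != len(true_ids):
        raise ValueError("pred_ids and true_ids must have the same length")
    if not true_ids:
        raise ValueError("at least one query is required")
    hits = sum(1 for p, t in zip(pred_ids, true_ids) if p == t)
    return hits / len(true_ids)


# Hypothetical toy data: 6 queries, only the first one is a top-1 hit.
true_ids = [10, 11, 12, 13, 14, 15]
pred_ids = [10, 99, 98, 97, 96, 95]

rate = top1_exact_rate(pred_ids, true_ids)
print(f"as a fraction: {rate:.4f}")       # value in [0, 1]
print(f"as a percent:  {100 * rate:.2f}%")  # same value scaled by 100
```

If the reported 0.17 is a fraction like `rate` above, it corresponds to roughly 17% of queries; if the script already multiplies by 100, it would mean 0.17%.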
About training:
test-query-db-500-30s and kept the training set (10k songs) unchanged?
TR_SPEECH_AUG = True in the config? This setting is necessary for the speech to be treated as noise.
Sanity check:
Great work with the repo. I have a couple of questions regarding using this for my context.
I'm trying to query songs in radio broadcast recordings. I initially trained the model using your data on Kaggle, but I got poor results (practically 0.00) during evaluation with my custom data. So I followed your advice in #41 to train the model on my own dataset. Since I don't have much data (just 3 audio recordings of 1 hr 30 min, split and categorized into music, speech and noise), I added the segments to the respective folders within your dataset and trained for 20 epochs.
For more context, I'm generating fingerprints using the code in #38. Evaluation seems to require dummy_db.mm and dummy_db_shape.npy, so I copied both into the logs folder for my checkpoint from the folder where fingerprints were generated while using your test data, and it works fine. With the model trained on my data combined with your Kaggle data, the evaluation metrics increased only marginally: I was able to get top-1 exact match scores of about 0.17-0.19. I assumed the problem could come from the dummy data used to train the index, so I tried using the L2 index type and even setting fake_recon_index = db as you mentioned in #38, but all evaluation results are still in the range 0.00 to 0.25. Is there anything I am doing wrong or missing?
Or is this repo simply not able to work for my type of data?
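One quick check for the copied dummy_db.mm and dummy_db_shape.npy files is whether the raw file size agrees with the recorded shape. The sketch below assumes the fingerprints are stored as raw float32 values in a (rows, dim) layout and that the shape tuple has already been read from the shape file (for example with `numpy.load`); both are assumptions, and `memmap_matches_shape` is a hypothetical helper, not part of the repo.

```python
import os
import tempfile
from array import array

FLOAT32_BYTES = 4  # assumption: fingerprints are stored as raw float32


def memmap_matches_shape(path, shape, itemsize=FLOAT32_BYTES):
    """Return True if the raw file at `path` holds exactly rows * dim values."""
    rows, dim = shape
    return os.path.getsize(path) == rows * dim * itemsize


# Toy demonstration with a hypothetical 5 x 128 fingerprint file.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "dummy_db.mm")
    with open(path, "wb") as f:
        array("f", [0.0] * (5 * 128)).tofile(f)

    ok = memmap_matches_shape(path, (5, 128))       # shape agrees with the file
    mismatch = memmap_matches_shape(path, (5, 64))  # wrong fingerprint dimension
    print(ok, mismatch)
```

A mismatch here would suggest that the copied dummy files were generated with a different fingerprint dimension or a different number of segments than the shape file claims.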