Are embeddings from "new" model better than those from the previous model?

teticio / Deej-AI

Create automatic playlists by using Deep Learning to *listen* to the music.

GNU General Public License v3.0


Are embeddings from "new" model better than those from the previous model? #109

Open RomaKoks opened 2 months ago

RomaKoks commented 2 months ago

As far as I know, you have a validation set. Have you compared the performance of both models on this set?

I'm curious because, in my opinion, the playlists generated from the previous (5-year-old) model's embeddings are better.

teticio commented 2 months ago

Interesting. It's a little difficult to evaluate because it is very subjective, but I did have a test set for the mp3tovec embedding training. How that translates into a better or worse playlist is another question. I think the larger universe (3x more tracks) makes it more of a challenge, but I'm happy to try to dig into this more.

teticio commented 2 months ago

Out of interest, which embeddings (track2vec or mp3tovec) are you referring to particularly? I was looking for the training logs just now but I'm not sure I kept them :-(

teticio commented 2 months ago

It's not possible to directly compare the new mp3tovec model with the old one, because each was trained to predict the track2vec embeddings, which were different in each version (so the task is different). I didn't use a validation set to train the track2vec model as it wasn't clear how to do that, so I judged it subjectively, training until I felt it had converged based on the top-n most similar songs for a bunch of tracks I know well that cover several genres.
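For readers following along, the kind of top-n similarity check described above can be sketched roughly as follows. This is only an illustration, not the code used in Deej-AI: the function names, the dictionary layout (track id mapped to a vector) and the toy vectors are all assumptions made up for the example.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    # Treat an all-zero vector as similar to nothing rather than dividing by zero.
    return dot / norm if norm else 0.0


def top_n_similar(embeddings, query_id, n=5):
    """Return the n (track_id, similarity) pairs closest to query_id."""
    query = embeddings[query_id]
    scored = [
        (track_id, cosine_similarity(query, vector))
        for track_id, vector in embeddings.items()
        if track_id != query_id  # skip the query track itself
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:n]


# Hypothetical toy embeddings standing in for real track2vec vectors.
toy_embeddings = {
    "rock_a": [1.0, 0.1, 0.0],
    "rock_b": [0.9, 0.2, 0.0],
    "jazz_a": [0.0, 1.0, 0.1],
    "jazz_b": [0.1, 0.9, 0.2],
}

for track_id, similarity in top_n_similar(toy_embeddings, "rock_a", n=2):
    print(track_id, round(similarity, 3))
```

Eyeballing such lists for a handful of well-known tracks is quick, but it stays a subjective check, which is why it is hard to turn into a single comparable score.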

RomaKoks commented 2 months ago

I'm talking about mp3tovec, which uses MEL spectrograms as input. If we can create an evaluation set of triplets (three tracks each), labelled with which two of the three are most similar, then we can calculate various classification metrics and some distance-based metrics, such as the cosine distance gap.
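A minimal sketch of such a triplet evaluation, under stated assumptions: each triplet is written as `(first, second, odd)`, where a human has labelled `first` and `second` as the most similar pair; the "gap" is defined here as the mean cosine distance of the two other pairs minus the distance of the labelled pair, which is only one possible reading of "cosine distance gap". The function names and toy vectors are hypothetical.

```python
import math
from itertools import combinations


def cosine_distance(a, b):
    """1 - cosine similarity; assumes neither vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def evaluate_triplets(embeddings, triplets):
    """Score embeddings against labelled (first, second, odd) triplets."""
    if not triplets:
        raise ValueError("need at least one triplet")
    correct = 0
    gaps = []
    for first, second, odd in triplets:
        # Cosine distance for each of the three possible pairs.
        distances = {
            frozenset(pair): cosine_distance(embeddings[pair[0]], embeddings[pair[1]])
            for pair in combinations((first, second, odd), 2)
        }
        labelled = frozenset((first, second))
        # The model "predicts" whichever pair it places closest together.
        predicted = min(distances, key=distances.get)
        correct += predicted == labelled
        others = [d for pair, d in distances.items() if pair != labelled]
        gaps.append(sum(others) / len(others) - distances[labelled])
    return {"accuracy": correct / len(triplets), "mean_gap": sum(gaps) / len(gaps)}


# Hypothetical toy vectors standing in for mp3tovec embeddings.
toy_embeddings = {
    "rock_a": [1.0, 0.1, 0.0],
    "rock_b": [0.9, 0.2, 0.0],
    "jazz_a": [0.0, 1.0, 0.1],
    "jazz_b": [0.1, 0.9, 0.2],
}
toy_triplets = [("rock_a", "rock_b", "jazz_a"), ("jazz_a", "jazz_b", "rock_a")]

print(evaluate_triplets(toy_embeddings, toy_triplets))
```

Because the labels come from people rather than from either model's training target, the same triplets could be scored with the old and the new embeddings and the two results compared side by side.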

RomaKoks commented 2 months ago

And I have a question unrelated to the topic:

Do you have precomputed MEL spectrograms for the tracks used in training the new model? I want to try retraining the model, but I don't want to download the tracks because it takes a long time. And calculating the MEL spectrograms is also a time-consuming task.

teticio commented 2 months ago

Hi. So I did use a validation set for the training of the mp3tovec model (to avoid overfitting). I don't have the results around any more though :-(. I also didn't keep the downloaded mp3s or MEL spectrograms as they take up a lot of space. Recomputing them is time-consuming, but if you have a few cores on your machine it isn't too bad.