epoupon opened 1 year ago
The blog post linked in the original issue, as well as some comments, said that they're not really into the idea of acoustic-based recommendations (the AB team tried and failed). I wonder if we really should continue down this route...
I think this is the comment you are talking about: https://blog.metabrainz.org/2022/02/16/acousticbrainz-making-a-hard-decision-to-end-the-project/#comment-352090
The problem with tag-based clustering is that you need a reliable source of tags for it to be useful. If the source is not reliable, iPod whiplash will be produced regardless. What could be a reliable source? beets taggers could be an alternative, but some are unreliable, like the Deezer one (I downloaded my music from Deezer and the tagging is quite bad).
Yes, indeed, this is a difficult subject and the way it is done may be wrong. But lms does not have exactly the same goals as AB: here we just want to help the user find similar music in their local library. There is no high-level genre guessing or classification of unknown tracks. Well, I still want to put some effort into this; let's see how it will go!
Is the current Features engine taking into account the genre tags of the tracks?
If I recall correctly, there are the following use cases regarding genre/mood tagging:
By homogeneous I mean that all tracks have the same number of genres and that there are no duplicate genres (Pop/Rock vs Rock Pop).
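As a rough sketch of what "no duplicate genres" could mean in practice (function name and splitting rules are my own assumptions, not anything lms or beets actually does), compound tags could be split and normalized so that variants like "Pop/Rock" and "Rock Pop" collapse to the same set:

```python
import re

def normalize_genres(raw_tags):
    """Split compound tags (e.g. 'Pop/Rock'), lowercase, and dedupe,
    so 'Pop/Rock' and 'Rock Pop' yield the same genre set."""
    parts = []
    for tag in raw_tags:
        # Hypothetical separators: slash, semicolon, comma, whitespace.
        parts.extend(re.split(r"[/;,]|\s+", tag))
    return sorted({p.lower() for p in parts if p})

print(normalize_genres(["Pop/Rock"]))  # ['pop', 'rock']
print(normalize_genres(["Rock Pop"]))  # ['pop', 'rock']
```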
So we need an engine that relies entirely on similarity indexes when there are no tags or they are unreliable, and on a combination of both when the genre tags are reliable. By combining already existing genre tags with low-level acoustic data, the recommendations may be far better than with just the former (Clusters engine) or the latter (the problem AB had).
One way could be to define a "genre similarity index" and add it as a feature in the feature vector. Another could be a heuristic combining the "genre similarity" with the "acoustic similarity", like (A + G) · X or (A · X + G · Y), where X and Y are random weights.
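To make the second heuristic concrete, here is a minimal sketch of A · X + G · Y, assuming the genre similarity G is a Jaccard index over normalized tag sets and the acoustic similarity A is already computed elsewhere (the function names and the weight values are illustrative assumptions, not lms code):

```python
def jaccard(tags_a, tags_b):
    """Genre similarity G: Jaccard index over two genre-tag sets."""
    a, b = set(tags_a), set(tags_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def combined_similarity(acoustic_sim, tags_a, tags_b, x=0.7, y=0.3):
    """Heuristic A*X + G*Y; x and y are tunable (or randomized) weights."""
    return acoustic_sim * x + jaccard(tags_a, tags_b) * y

# Example: A = 0.8, shared genre 'rock' out of {rock, pop}, equal weights.
print(combined_similarity(0.8, ["rock", "pop"], ["rock"], 0.5, 0.5))  # 0.65
```

With X = 1 and Y = 1 this reduces to the (A + G) · X form up to scaling, so both variants can be explored with the same helper.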
I'm trying beets' lastgenre tagger, which uses last.fm genre data. It's good because you can fix the number of genres for each album, so that the tags will be as homogeneous as possible. The problem is that for some scattered, uncommon tracks there is no genre data on last.fm's side; it wouldn't be correct to exclude those from the recommendations. So there is a fourth use case:
Currently, the similar results of the recommendation engine are taken either from the acoustic similarities (if at least one result was found) or from the tags (fallback). Indeed, we could imagine combining both sources to give better results as you suggest. And indeed the current tag-based engine is biased when tracks are not homogeneously tagged (but I think that is another problem).
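The fallback behavior described above could be sketched like this (the engine interfaces and names are hypothetical, just to illustrate the acoustic-first, tags-as-fallback flow):

```python
def recommend(track, acoustic_engine, tag_engine, limit=10):
    """Return acoustic-similarity results if any were found,
    otherwise fall back to the tag-based engine."""
    results = acoustic_engine.similar(track, limit)
    if results:
        return results
    return tag_engine.similar(track, limit)

class StubEngine:
    """Tiny stand-in engine for demonstration purposes."""
    def __init__(self, results):
        self._results = results
    def similar(self, track, limit):
        return self._results[:limit]

print(recommend("some track", StubEngine(["a", "b"]), StubEngine(["c"])))  # ['a', 'b']
print(recommend("some track", StubEngine([]), StubEngine(["c"])))          # ['c']
```

Combining the two sources instead of falling back would replace the `if results:` branch with a merge/re-rank step over both result lists.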
See #221