regorxxx / Search-by-Distance-SMP

An implementation of Music-Graph for foobar2000, using Spider Monkey, which creates intelligent "spotify-like" playlists using high-level data from tracks and computing their similarity using genres/styles.
https://regorxxx.github.io/foobar2000-SMP.github.io/scripts/search-by-distance-smp/
GNU Affero General Public License v3.0

[FEATURE] Custom graph for custom tag #18

Open username116 opened 11 months ago

username116 commented 11 months ago

I'd like to use custom graphs with Search-by-Distance custom tags. To use genres from online music databases. For example a Discogs graph with Discogs tags.

I've made graphs here. It's a start, they can be improved.

I could make substitutions in the descriptor files for graphs that aren't complex. You've done it for AllMusic. But it seems difficult to merge a large graph with the original one.

regorxxx commented 11 months ago

Having a different genre/style tag associated with a totally independent graph would require a big rework of the current implementation for foobar2000.

Duplicating the current graph code for new graphs is not really that big a job, it's somewhat trivial... but to make it work properly you would then need to duplicate all the current graph settings per "custom graph". That would make the script so complex that no one would know what to do with it xd

Also, the different graphs are somewhat incompatible with each other. RYM and MusicBrainz have totally different tagging rules, which would mean results would be worse (and not better). Either you stick to a graph with a lot of nodes, or to a more abstract one like RYM or Discogs, but mixing both in different tags would simply pollute results with things which are neither "more similar" nor "more different".

Now my take on this is I don't generally agree with more tags being better. I prefer quality over quantity.

  1. There is a lot of overlap between Discogs, AllMusic and the current graph (which can be taken care of with substitutions).
  2. A lot of current tagging is simply wrong:
    • Tags are too abstract. (For Discogs, Death Metal is Rock, just like Pop Punk.)
    • Arbitrary tag sets: there are thousands of releases tagged as X just because there is nothing better, or as a general genre for the entire album (which is false in many cases).
    • Questionable genres, which are folksonomy tags. Children's music is not a specific genre: it may be folk, may be classical music (found in movies), may be spoken word, poetry, songs, ... it may be anything. It is not a musical genre, it is a folksonomy tag, which is fine if you use it as a custom tag. But there is clearly no objective relation between Children's music and Folk, for example.
    • Cultural folksonomy tags as genres. There are thousands of genres which are just regional variants of a given genre, with no specific and homogeneous differences.
  3. Most releases on MusicBrainz have no tags (even if their genre map is much better than the rest); many releases have arbitrary tags set, with no real care for differences between tracks.
  4. There is also a need to create links between those genres. Like... Pop being closer to Rock than to Classical music. I know your examples are WIP, but note that with the current way you have set them, all genres are equal in distance as long as they are in different clusters.
  5. You are using a central cluster for that, while I considered those big clusters to be in a "circle". Industrial -> Metal -> Rock -> Pop -> R&B -> Blues... (without a central point, just each cluster being linked to its neighbors)
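
To make points 4 and 5 concrete, here is a minimal sketch (not the script's actual graph code) comparing the two layouts. It assumes the six clusters named above form a closed ring and that every link costs one hop; `hubDistance` and `ringDistance` are hypothetical helper names.

```javascript
// Hypothetical super-genre clusters, in the ring order given above.
// Assumption: the ring closes after Blues (Blues links back to Industrial).
const clusters = ['Industrial', 'Metal', 'Rock', 'Pop', 'R&B', 'Blues'];

// Hub layout: every cluster links to one central node, so any two
// different clusters are always exactly 2 hops apart.
function hubDistance(a, b) {
  return a === b ? 0 : 2;
}

// Ring layout: each cluster links only to its two neighbours, so the
// distance is the shorter way around the circle.
function ringDistance(a, b) {
  const i = clusters.indexOf(a);
  const j = clusters.indexOf(b);
  if (i === -1 || j === -1) throw new Error('Unknown cluster');
  const direct = Math.abs(i - j);
  return Math.min(direct, clusters.length - direct);
}

console.log(hubDistance('Metal', 'Rock'), hubDistance('Industrial', 'Pop')); // 2 2
console.log(ringDistance('Metal', 'Rock'), ringDistance('Industrial', 'Pop')); // 1 3
```

With the hub, Metal is as far from Rock as Industrial is from Pop; with the ring, neighbouring clusters end up closer than clusters on opposite sides.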

I think for Discogs and AllMusic the best approach is simply adding substitutions.

regorxxx commented 11 months ago

Now for MusicBrainz and RYM, I see what you mean with the merging problem xd

My aim is adding more and more genres/styles (which make sense), so at some point any other graph should be covered by this one (another thing is the work required to make substitutions). You can use the button to help with that... let's say you add DISCOGS_GENRE to a custom tag, as a 'graph' tag. Then use that and it will report all genres not found (which would require a substitution).

Obviously if you can code... you can speed it up by comparing both lists with regexp and automatically finding close matches.
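
As a rough illustration of that idea (a sketch only, not part of the script), the snippet below normalises names with regular expressions and proposes graph genres that collapse to the same key. The function names and the sample lists are hypothetical; a real run would use the genres reported as not found and the genres present in the graph.

```javascript
// Normalise a genre name: lowercase, '&' becomes 'and', then drop
// everything that is not an ASCII letter or digit (spaces, hyphens, ...).
function normalise(name) {
  return name.toLowerCase().replace(/&/g, 'and').replace(/[^a-z0-9]+/g, '');
}

// For each genre missing from the graph, list graph genres whose
// normalised form is identical. These are only candidates for review.
function suggestSubstitutions(missing, graphGenres) {
  const index = new Map();
  for (const genre of graphGenres) {
    const key = normalise(genre);
    if (!index.has(key)) index.set(key, []);
    index.get(key).push(genre);
  }
  const suggestions = {};
  for (const name of missing) {
    const hits = (index.get(normalise(name)) || []).filter((g) => g !== name);
    if (hits.length) suggestions[name] = hits;
  }
  return suggestions;
}

// Hypothetical sample data.
const graphGenres = ['Hip-Hop', 'Drum & Bass', 'Death Metal'];
const missing = ['Hip Hop', 'Drum and Bass', 'Vaporwave'];
console.log(suggestSubstitutions(missing, graphGenres));
```

This only catches spelling and punctuation variants; names with non-ASCII letters, or genres that differ in meaning rather than spelling, still need manual review.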

regorxxx commented 11 months ago

Finally, while I think AllMusic and Discogs can be fully covered with substitutions, I don't see a problem with simply switching the current graph for a RYM graph or a MusicBrainz graph, i.e. a total substitution of the current graph, not having multi-graphs.

That's much easier to do, and the user would have to choose which one to use according to their tagging habits. Since you can have multiple buttons, nothing stops you from setting one graph type per button in case you wanna use multiple tag structures.

username116 commented 11 months ago

Also the different graphs are somewhat incompatible between them. RYM and MusicBrainz have totally different tagging rules, which would mean results would be worse (and not better).

Custom tag slots allow you to separate tags, that's what I've done in my settings. This way, if I've understood correctly, DISCOGS_GENREs will only be compared with each other, DISCOGS_STYLEs will only be compared with each other, etc.

I agree with everything you say about the shortcomings of online databases. But for the moment it's more homogeneous than my personal tags. And it's one more way of finding similar tracks.

You can use the button to help with that...

Thanks for the advice. I should do that and make the substitutions.

I know regexp, the easy ones. But I don't know javascript.

Since you can have multiple buttons

I hadn't thought of multiple buttons, with different graphs for each. Yes, that's close to what I'm looking for. I imagine I'll then have to mix the playlists manually, taking the best results from each side. I'm thinking of another button or menu that could sort the results of two or three buttons... as there are scores in the console, they could be put together, remove the duplicates and keep the 50 best scores?

nothing stops you to set one graph type per button

There's only one _cacheLink file? I don't know.

regorxxx commented 11 months ago

Custom tag slots allow you to separate tags, that's what I've done in my settings. This way, if I've understood correctly, DISCOGS_GENREs will only be compared with each other, DISCOGS_STYLEs will only be compared with each other, etc.

One thing is string comparison (weight for scoring), and another thing is the graph part (distance). All tags which have the 'graph' type are compared on the graph with no consideration for the source they come from. For scoring and weight, only tag values within the same slot are compared.

So in fact... all 'graph' type tags are used no matter where you put them. Which is the most logical design, since people's tags, Discogs' tags, AllMusic's tags, etc. have no real distinction between genre and style. I allow them to be split into 2 different tags, for the purpose of comparing values, but on the graph they are treated as the same entity (you can also add that type to any new tag, or a folksonomy tag or whatever).
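
The difference between the two comparisons can be sketched as follows. This is an illustration under assumptions, not the script's real data model: the slot list, the `graph` flag and both function names are hypothetical.

```javascript
// Hypothetical slot configuration: names and shape are illustrative only.
const slots = [
  { name: 'genre', graph: true },
  { name: 'style', graph: true },
  { name: 'discogsGenre', graph: true },
  { name: 'mood', graph: false },
];

// Scoring (weight): values are only matched against the same slot
// of the other track, never across slots.
function sharedPerSlot(a, b) {
  const shared = {};
  for (const { name } of slots) {
    const other = new Set(b[name] || []);
    shared[name] = (a[name] || []).filter((value) => other.has(value));
  }
  return shared;
}

// Distance: values from every slot flagged as 'graph' are pooled into
// one set, regardless of which slot (or database) they came from.
function graphValues(track) {
  const pooled = new Set();
  for (const { name, graph } of slots) {
    if (graph) for (const value of track[name] || []) pooled.add(value);
  }
  return [...pooled];
}

const trackA = { genre: ['Rock'], style: ['Grunge'], discogsGenre: ['Rock'], mood: ['Dark'] };
const trackB = { genre: ['Pop'], discogsGenre: ['Rock'], mood: ['Dark'] };

console.log(sharedPerSlot(trackA, trackB)); // only discogsGenre and mood share values
console.log(graphValues(trackA)); // [ 'Rock', 'Grunge' ]
```

Note that 'Rock' in `genre` of one track never matches 'Rock' in `discogsGenre` of the other for scoring, yet both land in the same pooled set for the graph.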

I agree with everything you say about the shortcomings of online databases. But for the moment it's more homogeneous than my personal tags. And it's one more way of finding similar tracks.

I agree with that BUT, then maybe this tool + multi-graphs is not the tool to find similar tracks! Not sure if you get my point. It may make more sense to simply apply specific graphs to specific tagging styles by different databases. Mixing them all at the same time does not give better results.

That's why I offer the possibility to add multiple customizable buttons, each one with different configuration. Then every button could load a different graph. There are also recipes for that, which only involve 2 clicks to totally switch configuration.

I'm thinking of another button or menu that could sort the results of two or three buttons... as there are scores in the console, they could be put together, remove the duplicates and keep the 50 best scores?

That's somewhat doable with the current design (and something I have already considered, to create a playlist similar to another playlist). Executing the analysis multiple times and joining the results in some way is on my todo list.

About the current state of things, you can already do that, since you can set the sorting by score instead of manually: you can easily take the first 10 tracks from those playlists. Also, Playlist Tools has a random pools feature which allows running Search by Distance searches under the hood and mixing them (with different settings).
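
The "put together, remove the duplicates and keep the 50 best scores" idea could look roughly like this. It is a sketch only: `mergeResults` is a hypothetical helper, and each result is assumed to be an object with a `path` and a `score`, which is not necessarily how the script reports them.

```javascript
// Merge several scored result lists into one ranked list.
// Assumption: each result is { path, score }; 'path' identifies a track.
function mergeResults(resultLists, limit = 50) {
  const best = new Map();
  for (const list of resultLists) {
    for (const track of list) {
      const previous = best.get(track.path);
      // Keep only the highest score seen for a duplicated track.
      if (!previous || track.score > previous.score) best.set(track.path, track);
    }
  }
  // Sort by score, highest first, and keep the top 'limit' entries.
  return [...best.values()].sort((a, b) => b.score - a.score).slice(0, limit);
}

// Hypothetical output from two buttons using different graphs.
const fromButtonA = [{ path: 'a.flac', score: 90 }, { path: 'b.flac', score: 70 }];
const fromButtonB = [{ path: 'b.flac', score: 85 }, { path: 'c.flac', score: 60 }];

console.log(mergeResults([fromButtonA, fromButtonB], 2));
```

Scores produced with different graphs or settings are not necessarily on the same scale, so ranking them together is itself a design choice.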

regorxxx commented 11 months ago

There's only one _cacheLink file? I don't know.

Currently, only one. Shared for all buttons. I would have to change that part to add the graph type to the filename. Missed that.

There are also memory implications with all this, since foobar2000 is currently limited by SMP not having an x64 build.

username116 commented 11 months ago

Thanks for the explanations. I didn't really understand the calculation method. I stupidly thought that the slots were separate. I also reread the documentation (here and there). It helped me understand better.

So in fact... all 'graph' type tags are used no matter where you put them. Which is the most logical design, since people's tags, Discogs' tags, AllMusic's tags, etc. have no real distinction between genre and style. I allow them to be split into 2 different tags, for the purpose of comparing values, but on the graph they are treated as the same entity

There are differences on Discogs and Rate Your Music:

I don't know whether this justifies a change in the distance calculation (mixing all the 'graph' values, or calculating them independently by slots).

It may make more sense to simply apply specific graphs to specific tagging styles by different databases. Mixing them all at the same time does not give better results. That's why I offer the possibility to add multiple customizable buttons, each one with different configuration. Then every button could load a different graph. There are also recipes for that, which only involve 2 clicks to totally switch configuration.

Yes, I didn't quite understand the calculation method. And yes, it seems better to look up a database's tags only in the corresponding graph, rather than mixing everything. With the current calculation method (all 'graph' values mixed), I agree with a graph setting per button or per recipe. I would be happy to have this new feature.

Since you can set the sorting by score instead of manually: you can easily take the first 10 tracks from those playlists. Also, Playlist Tools has a random pools feature which allows running Search by Distance searches under the hood and mixing them (with different settings).

Yes I was thinking about that, I already use sorting by score. I tried Playlist Tools and saw the 'Custom pool' command. That's what I was looking for. I thought of a few things about sorting and macros, but will have to see later.

username116 commented 9 months ago

I made a second button with another graph, using other folders js_data and xxx-scripts, like this.