YetAnotherNerd / whatlastgenre

Improve genre metadata of audio files based on tags from various music sites.
http://yetanothernerd.github.io/whatlastgenre
MIT License

CSV metadata tags that are not ID3v2.3 #5

Closed by ctlaltdefeat 9 years ago

ctlaltdefeat commented 9 years ago

When there are multiple artists (separated by semi-colon), the album basically never gets found.

YetAnotherNerd commented 9 years ago

Tags should be multi-valued instead of CSV, is this ID3v2.3 related?

ctlaltdefeat commented 9 years ago

No, I have ID3v2.4 tags. It could be I'm not describing this accurately, but for example: https://what.cd/torrents.php?id=72802865. According to MusicBrainz this would have two artists, but it can't be found on what.cd (probably due to the "performed by" thing).

YetAnotherNerd commented 9 years ago

that url doesn't help me reproduce your problem. what are the contents of the tags/metadata of your problematic album? which searchstring is wlg creating from it (as shown in the output)?

ctlaltdefeat commented 9 years ago

The created string is "johann sebastian bach; angela hewitt the art of fugue". The issue is the semicolon: without it, the search works. My stuff is tagged with beets (using the default ID3v2.4 option) with the appropriate MusicBrainz album.

YetAnotherNerd commented 9 years ago

mh, so just removing the semicolon might work on whatcd, but might not work on other sources, since "johann sebastian bach angela hewitt" is generally not a valid artist name. however, semicolons might not be valid searchstring characters (for sources other than mbrainz, but mbrainz might use mbid search in this case), so they should probably be removed anyway. i think the right approach is to just split the CSV data, like it should be done with id3v23 separated tags.
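
as a rough illustration of that splitting approach (a sketch only, not wlg's actual code): `split_artists` and `build_searchstrings` are hypothetical names, and both the semicolon separator and the choice to build one search string per artist are assumptions made for the example.

```python
# Hypothetical helpers, not part of whatlastgenre.

def split_artists(values, sep=";"):
    """Split each tag value on `sep`; return unique, non-empty names in order."""
    artists = []
    for value in values:
        for part in value.split(sep):
            name = part.strip()
            # Skip empty fragments (e.g. from "a;;b") and duplicates.
            if name and name not in artists:
                artists.append(name)
    return artists


def build_searchstrings(artist_values, album):
    """Build one lowercase 'artist album' search string per split artist."""
    return [f"{artist} {album}".lower() for artist in split_artists(artist_values)]


# The artist tag from this issue, stored as a single semicolon-separated value.
queries = build_searchstrings(
    ["Johann Sebastian Bach; Angela Hewitt"], "The Art of Fugue"
)
for query in queries:
    print(query)
```

with the tag from this issue, this gives one query for each artist instead of a single string containing the semicolon. whether a source should be queried once per artist or only with the first artist is a separate decision.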

some possible scenarios for reference: https://gist.github.com/YetAnotherNerd/dac85807b63c2e1c2d2f

ctlaltdefeat commented 9 years ago

Yeah, probably. It's worth noting that in this case, all scenarios lead to fairly similar tags.