Closed ghost closed 10 years ago
Well, yes but the plugin has no way to know that unless you tell him.
Copy https://github.com/sampsyo/beets/blob/master/beetsplug/lastgenre/genres-tree.yaml on your hard drive, and edit it so that Alt. Rock
and Alternative Rock
are children of Alternative-Rock
.
Use that file for the canonicalization option + make use of the whitelist option.
ps: I wrote about that genres canonicalization feature on my blog
This does not work for me. I did beet lastgenre -f beet move Still have the folders
27 lastgenre:
28 force: yes
29 canonical: /home/markus/.config/beets/genres-tree.yaml
30 whitelist: /home/markus/.config/beets/genres.txt
genres-tree.yaml
323 - rock:
324 - alternative rock:
325 - alt. rock
326 - alternative-rock
genres.txt
23 alternative rock
24 alt. rock
25 alternative-rock
Are you sure these genres are not already present on the files? Lastgenre will, by default, not change the existing genre if it can't find a suitable genre for a given album.
Set the "fallback" option if you instead want to clear these genres.
To help see what's going on, it would be helpful to see the output of the lastgenre
command, which indicates the source for each genre it applies.
Closing for now, since the user seems to have disappeared.
Hi. I have exactly the same problem, but I found out, that with the command chain:
beet lastgenre (-f) beet move
beet did not move the files, but copied them. My Problem was that canoncicalisation did not work with the default entry (canonical: '') for some reason. So I had to copy the genres-tree.yaml file and put it in (canonical: '/path/to/file'). After that, I ran beet lastgenre and could verify that the obscure genres dissappeared and had replaced by other, more generic ones.
Then I tried to move the files with beet move, but now I have duplicated any album, which had different genre entries before and after the lastgenre command.
This is more or less a little desaster, because now I have to find all the duplicates and remove them.
@marlemion Are you saying that "beet move" copies files every time, even without the copy flag? That's very troublesome and a separate issue. Can you open another bug describing your setup?
Sorry. False alarm. I moved the entire collection to another directory via beet move and it didn't leave anything in the old directory. So no duplication.
But I reckoned why I actually thought it would have been duped. I want canonicalization with lastgenre and put the albums in directories with the genres as name. However, canonicalization does not work. I have tried canonical: '' and I have tried to use a separate file as indicated above.
As I am on Gentoo, beets ist at 1.3.3 right now. Might be the version the problem?
My config for lastgenre:
lastgenre: whitelist: /usr/lib/python2.7/site-packages/beetsplug/lastgenre/genres.txt min_weight: 10 count: 1 fallback: Unknown canonical: /usr/lib/python2.7/site-packages/beetsplug/lastgenre/genres-tree.yaml source: 'artist' force: True auto: True
genres-tree.yaml:
beet -v lastgenre -f galliano:
data directory: /home/XXXX/.config/beets library database: /mnt/sda/Music/musiclibrary.blb library directory: /mnt/sda/Music lastfm.tag (min. 10): acid jazz [100] genre for album Galliano - : 4 (artist): Acid Jazz Sending event: database_change Sending event: write Sending event: write Sending event: write Sending event: write Sending event: write Sending event: write Sending event: write Sending event: write Sending event: write Sending event: write Sending event: write Sending event: write Sending event: write Sending event: cli_exit
What's wrong?
Well, I tracked it down in the plugin: in the function 'find_parents' the script searches for candidate chains including the obtained genre. In my example this is:
['electronic', 'electronica', 'downtempo', u'acid jazz']
After that, it tries to get the parent from this chain. What it is actually doing in function '_strings_to_genre' is to get one genre and check whether it is in the whitelist (_is_allowed(parent)) and if it is, it breaks.
However, it finds acid jazz at first and then breaks. This is not how it should work, is it?
Does it mean I would have to remove Acid Jazz from the Whitelist?
Removing acid jazz from the whitelist resulted in the lastfm tag to be chosen as 'funk' which is apparently the next tag offered by lastfm. So blacklisting is not the solution. There is something fishy with canonicalization and I am puzzled that nobody encounters it but me.
Sorry for spamming btw.
When I use 'reversed(find_parents(XXX)), it grabs the generic genre before the existing.
Aha—do you think reversing the list is the right solution? If so, do you think you can open a pull request? This might be simple enough that the in-browser "edit" button could suffice to make the PR.
Because the c14n is supposed to return the most accurate whitelisted genre, I don't think reversing the list will work in the general case (ie multiple whitelisted genres along the branch).
Can you give me the album name so I can try to reproduce the pb ?
Yeah, right, reversed order would always result in the very basic genres. Is there a way to adjust the level of genres? As far as I understood the genres-tree.yaml file, there are several layers of genres, e.g. very basic, basic, less basic, specific, more specific, ridicilous specific. If I would like to have the 'basic' genres only, because very basic would be too generic and less basic too specialized, is there any way to achieve this with the current implementation?
The link to the album at discogs: http://www.discogs.com/Galliano-4/release/166950 and on musicbrainz: http://musicbrainz.org/release/e47312ec-43b2-4efa-a7af-a6a37a8be5d5
Thanks for your help, guys, beets rocks. :)
Whether you qualify a genre "basic" or "ridiculous specific" is subjective and depends mainly of how deep you are into a given genre ;) Someone who have thousands of jazz files may want to distinct "acid jazz" and, say, "chill out". It's likely that you want to be more specific in one genre area than another, that's why there is not a global genre-acurateness option.
The use case I had in mind when coding c14n is that you have the given set of genres of your library and you want to make sure any new file imported will use one of those.
To achieve this, you edit genres-tree.yaml
so that each genre become the root of the branch it belongs to. Because you enforced only one allowed genre per branch (the root) this tree structure permits to funnel non-allowed genres into allowed genres.
I wrote a blog post about that issue a while ago, you may want to read it, as it has some illustrations that hopefully make my reasoning more clear.
From you config i see :
whitelist: /usr/lib/python2.7/site-packages/beetsplug/lastgenre/genres.txt
Have you edited the file ?
By default it contains all genres but you want to strip the content to list only the genres you want to handle in your library.
eg if the file contains only electronica then the algo will do acid jazz => downtempo => electronica and use the latter.
Tried this. See my 4th post. It does not follow your described behavior. Unfortunately...
I will have to think about modifying the tree-fil as suggested in your blog post, but I still don't see the use for the different layers in the tree file then. Anyway, I'll be away until next Tuesday, so this will have to wait. :(
HI, it is me again. I have followed your advise and modified the genres.txt to include some very generic genres, for example 'dance' and 'hip hop & rap'.
It works quite ok right now, but still there are issues I don't understand. For example, I try to get the right genre for 'ASD', a German rap combo. But instead, it sorts it into 'dance'. I have modified the lastgenre plugin code to be more verbose, especially to show all tags and weight such as read from the last.fm webpage:
res_tag: deutschrap (100) res_tag: hip hop (100) res_tag: hip-hop (97) res_tag: german (89) res_tag: rap (78) res_tag: german hiphop (51) res_tag: samy deluxe (19) res_tag: german hip hop (14) res_tag: german hip-hop (14) res_tag: afrob (12) res_tag: deutsch (12) res_tag: hamburg (12) res_tag: deutscher hip hop (10) res_tag: asd (10) res_tag: deutsch rap (10) res_tag: hiphop (10) res_tag: german rap (10) res_tag: hamburgs finest (8) res_tag: deutscher sprechgesang (6) res_tag: german artists (6) res_tag: hamburg hip-hop (6) res_tag: old school (6) res_tag: stuttgart (6) res_tag: deutscher hiphop (4) res_tag: geht ab (4) res_tag: kolchose (4) res_tag: stuttgart hip-hop (4) res_tag: splash festival (4) res_tag: kultrap (4) res_tag: hip hop mit niveau (4) res_tag: deutsch-hip-hop (4) res_tag: deutscha hip-hop (4) res_tag: rocken roll : (4) res_tag: germany (4) res_tag: real shit (4) res_tag: male vocalists (4) res_tag: german bands (4) res_tag: hamburger schule (4) res_tag: smooth (4) res_tag: oldschool german hip-hop (4) res_tag: since (4) res_tag: italian trash (2) res_tag: no more war (2) res_tag: party (2) res_tag: band (2) res_tag: dance (2) lastfm.tag (min. 15): dance [2] tags: [u'dance'] tag: dance
Here you can see the problem: the first 4 items of tags are all rated very high, but still, it is sorted into 'dance', which is the last on the list. I have set min_weight > 2 (actuall 15), but it still sorts it in dance. hip-hop or hip hop should be funneled into 'hip hop & rap':
Any thought on this? It definitely does not behave as it should. 'dance' and 'hip hop' are even the same level, 'hip hop & rap' is one level above dance. I don't get the logic of choosing.
Shouldn't be the logic the following:
Grab the first tag with the highest weight from lastfm. Check whether it is in the genres-tree. If not, skip it. If yes, check, if it is in the whitelist. If yes, set it. If no, funnel it down (one level or more levels). Check whether the funneled genre is in the whitelist. If yes, set it, if not, either funnel further down, if possible, or go back to the top and choose the next tag provided by lastfm.
This logic should find the first match with the whitelist and the highest weight, which is canonicalized, if necessary.
Does it work like this? If I go through the code, it just throws out any tag, which is not in the whitelist. After that, it thinks about canonicalizing, if needed. In my case, it just finds 'dance' and is happy. But this is not how it should be. It should find 'hip hop & rap'.
When I whitelist e.g. hip hop, it looks like this:
res_tag: deutschrap (100) res_tag: hip hop (100) lastfm.tag (min. 15): hip hop [100] tags: [u'hip hop'] tag: hip hop genre for album ASD - Wer hätte das gedacht? (artist): Hip Hop
You see, what it does: it checks, if 'deutschrap' is in the whitelist - it is not: skipped. Then it goes on with 'hip hop' - it is. So it goes on with its (hopefully checks for the weight, even though there is a 'min_weight > weight' statement included, which I don't get). Then it sets 'hip hop'.
But: I have canonicalization enabled! Thus, It does either not canonicalize correctly (better: at all) but just uses the whitelisted tags.
In my opinion, the code is broken. It does not, what it is supposed to do.
I have modified the plugin to follow the logic as set out above (e.g. canonicalize down, if it is not in the whitelist and then check again):
def _tags_for(obj):
"""Given a pylast entity (album or track), returns a list of
tag names for that entity. Returns an empty list if the entity is
not found or another error occurs.
"""
try:
# Work around an inconsistency in pylast where
# Album.get_top_tags() does not return TopItem instances.
# https://code.google.com/p/pylast/issues/detail?id=85
if isinstance(obj, pylast.Album):
res = super(pylast.Album, obj).get_top_tags()
else:
res = obj.get_top_tags()
except PYLAST_EXCEPTIONS as exc:
log.debug(u'last.fm error: %s' % unicode(exc))
return []
tags = []
min_weight = config['lastgenre']['min_weight'].get(int)
count = config['lastgenre']['count'].get(int)
dbg = []
for el in res:
weight = int(el.weight or 0)
tag = el.item.get_name().lower()
log.debug(u'res_tag: {0} ({1})'.format(
tag,weight))
if _is_allowed(tag):
if min_weight > -1 and min_weight > weight and len(tags) > 0:
return tags
tags.append(tag)
dbg.append(u'{0} [{1}]'.format(tag, weight))
if len(tags) == count:
break
for parent in find_parents(tag, options['branches']):
log.debug(u'res_tag:canon {0})'.format(
parent))
if _is_allowed(parent):
tags.append(parent)
dbg.append(u'{0} [{1}]'.format(tag, weight))
if len(tags) == count:
break
log.debug(u'lastfm.tag (min. {0}): {1}'.format(
min_weight, u', '.join(dbg)
))
return tags
Now it does it correctly:
res_tag: deutschrap (100) res_tag:canon deutschrap) res_tag: hip hop (100) res_tag:canon hip hop) res_tag:canon hip hop & rap) res_tag: hip-hop (97) res_tag:canon hip-hop) res_tag: german (89) res_tag:canon german) res_tag: rap (78) res_tag:canon rap) res_tag:canon hip hop & rap) res_tag: german hiphop (51) res_tag:canon german hiphop) res_tag: samy deluxe (19) res_tag:canon samy deluxe) res_tag: german hip hop (14) res_tag:canon german hip hop) res_tag: german hip-hop (14) res_tag:canon german hip-hop) res_tag: afrob (12) res_tag:canon afrob) res_tag: deutsch (12) res_tag:canon deutsch) res_tag: hamburg (12) res_tag:canon hamburg) res_tag: deutscher hip hop (10) res_tag:canon deutscher hip hop) res_tag: asd (10) res_tag:canon asd) res_tag: deutsch rap (10) res_tag:canon deutsch rap) res_tag: hiphop (10) res_tag:canon hiphop) res_tag: german rap (10) res_tag:canon german rap) res_tag: hamburgs finest (8) res_tag:canon hamburgs finest) res_tag: deutscher sprechgesang (6) res_tag:canon deutscher sprechgesang) res_tag: german artists (6) res_tag:canon german artists) res_tag: hamburg hip-hop (6) res_tag:canon hamburg hip-hop) res_tag: old school (6) res_tag:canon old school) res_tag: stuttgart (6) res_tag:canon stuttgart) res_tag: deutscher hiphop (4) res_tag:canon deutscher hiphop) res_tag: geht ab (4) res_tag:canon geht ab) res_tag: kolchose (4) res_tag:canon kolchose) res_tag: stuttgart hip-hop (4) res_tag:canon stuttgart hip-hop) res_tag: splash festival (4) res_tag:canon splash festival) res_tag: kultrap (4) res_tag:canon kultrap) res_tag: hip hop mit niveau (4) res_tag:canon hip hop mit niveau) res_tag: deutsch-hip-hop (4) res_tag:canon deutsch-hip-hop) res_tag: deutscha hip-hop (4) res_tag:canon deutscha hip-hop) res_tag: rocken roll : (4) res_tag:canon rocken roll :) res_tag: germany (4) res_tag:canon germany) res_tag: real shit (4) res_tag:canon real shit) res_tag: male vocalists (4) res_tag:canon male vocalists) res_tag: german bands (4) res_tag:canon german bands) res_tag: hamburger schule (4) res_tag:canon hamburger schule) res_tag: smooth (4) res_tag:canon smooth) res_tag: oldschool german hip-hop (4) res_tag:canon oldschool german hip-hop) res_tag: since (4) res_tag:canon since) res_tag: italian trash (2) res_tag:canon italian trash) res_tag: no more war (2) res_tag:canon no more war) res_tag: party (2) res_tag:canon party) res_tag: band (2) res_tag:canon band) res_tag: dance (2) tags: ['hip hop & rap', 'hip hop & rap'] tag: hip hop & rap tag: hip hop & rap genre for album ASD - Wer hätte das gedacht? (artist): Hip Hop & Rap
What do you think?
Seems you've put a lot of time debugging this and even have a working code, great ! I'm not familiar with the lastfm weighings that have been introduced recently, but you're right it seems to have broken the c14n logic. Would you mind submit your code in a PR to facilitate the review ? thks. :rabbit:
I second the pull request suggestion! :hatched_chick:
Done. Thanks for your help! https://github.com/sampsyo/beets/pull/706
I'm using the lastgenre plugin. This seems to have a bug I have genres Alt. Rock, Alternative Rock and Alternative-Rock Isn't this all the same?