beetbox / beets

music library manager and MusicBrainz tagger
http://beets.io/
MIT License

lyrics: In Genius backend, tolerate artist disambiguation markers #4791

Open ronajon opened 1 year ago

ronajon commented 1 year ago

Problem

Importing Genius lyrics does not work for certain artists. These artists are not known on Genius by their band name alone (Psychonaut or Brutus). Because multiple bands share the same name, they are listed with a disambiguating suffix, e.g. psychonaut-be (https://genius.com/artists/Psychonaut-be) and brutus-be (https://genius.com/artists/brutus-be).

Running this command in verbose (-vv) mode:

$ beet -vv lyrics violate consensus reality all your gods have gone

Led to this problem:

user configuration: /Media/home/.config/beets/config.yaml
data directory: /Media/home/.config/beets
plugin paths: 
Sending event: pluginload
library database: /Media/home/.config/beets/library.db
library directory: /Media/Music
Sending event: library_opened
lyrics: Genius failed to find a matching artist for 'Psychonaut'
lyrics: failed to fetch: https://www.musixmatch.com/lyrics/Psychonaut/All-Your-Gods-Have-Gone (404)
lyrics: lyrics not found: Psychonaut - Violate Consensus Reality - All Your Gods Have Gone
Sending event: cli_exit

Here's a link to the music files that trigger the bug (if relevant):

Setup

My configuration (output of beet config) is:

lyrics:
    bing_lang_from: []
    google_API_key: REDACTED
    google_engine_ID: REDACTED
    fallback: ''
    sources: genius musixmatch
    auto: yes
    bing_client_secret: REDACTED
    bing_lang_to:
    genius_api_key: REDACTED
    force: no
    local: no
directory: /Media/Music
library: /Media/home/.config/beets/library.db

import:
    copy: no
    write: yes
ignore: ['?eaDir*']
incremental: yes
genres: yes

ui:
    color: yes

paths:
    default: $albumartist/$albumartist - $year - $album/$albumartist - $album - $track - $title

plugins: web discogs fetchart mbsync duplicates info missing lyrics
web:
    host: 0.0.0.0
    readonly: no
    include_paths: yes
    port: 8337
    cors: ''
    cors_supports_credentials: no
    reverse_proxy: no
fetchart:
    auto: yes
    cover_names: cover front art album folder
    sources: coverart itunes amazon albumart
    minwidth: 0
    maxwidth: 0
    quality: 0
    max_filesize: 0
    enforce_ratio: no
    cautious: no
    google_key: REDACTED
    google_engine: 001442825323518660753:hrh5ch1gjzm
    fanarttv_key: REDACTED
    lastfm_key: REDACTED
    store_source: no
    high_resolution: no
    deinterlace: no
    cover_format:
discogs:
    index_tracks: yes
    apikey: REDACTED
    apisecret: REDACTED
    tokenfile: discogs_token.json
    source_weight: 0.5
    user_token: REDACTED
    separator: ', '
missing:
    count: no
    total: no
    album: no
duplicates:
    album: no
    checksum: ''
    copy: ''
    count: no
    delete: no
    format: ''
    full: no
    keys: []
    merge: no
    move: ''
    path: no
    tiebreak: {}
    strict: no
    tag: ''

sampsyo commented 1 year ago

This sounds annoying! It would be helpful to experiment with different ways of resolving the ambiguity. For example, does simply dropping the last two-letter word always work, or does that ever introduce ambiguity with a different artist?

Here's where to start when tweaking the matching heuristic: https://github.com/beetbox/beets/blob/9527a07767629c1ceb99c2cd681b78172a7272a0/beetsplug/lyrics.py#L361

ronajon commented 1 year ago

If I use a regex to strip the bracketed two-letter country code from the artist name, it seems to work. The change is at line 359:

old

hit_artist = hit["result"]["primary_artist"]["name"]

new

hit_artist = re.sub(r'.[\(\[]..[\)\]]', '', hit["result"]["primary_artist"]["name"])
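
For reference, here is a small sketch of what that substitution does to a name. It assumes Genius returns names such as "Psychonaut (BE)", which the thread does not confirm. `STRICT` is a hypothetical tighter variant, not something beets uses: it requires whitespace before the bracket, exactly two letters inside, and a position at the end of the name.

```python
import re

# Pattern from the comment above: any one character, an opening round or
# square bracket, any two characters, then a closing bracket.
LOOSE = re.compile(r'.[\(\[]..[\)\]]')

# Hypothetical stricter variant (an assumption, not beets code): whitespace,
# then exactly two letters in round or square brackets, only at the end.
STRICT = re.compile(r'\s+[\(\[][A-Za-z]{2}[\)\]]$')


def strip_loose(name):
    """Apply the pattern from the thread to an artist name."""
    return LOOSE.sub('', name)


def strip_strict(name):
    """Apply the stricter, end-anchored pattern to an artist name."""
    return STRICT.sub('', name)
```

Both functions turn "Psychonaut (BE)" into "Psychonaut". They differ on names that only look similar: the loose pattern also removes " (42)" from the middle of a name, and it eats the character before the bracket even when that character is not a space.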

sampsyo commented 1 year ago

Nice, that seems like a good step! An eventual PR should try both (the original and truncated name, if any) to make sure we don't miss artists that happen to look like this.
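
A minimal sketch of the "try both" idea, under a few assumptions: `candidate_names` and `artist_matches` are hypothetical helpers, the suffix pattern is an illustrative guess at what a disambiguation marker looks like, and the case-insensitive comparison is a stand-in for whatever normalization the Genius backend actually applies.

```python
import re

# Assumed shape of a disambiguation marker: whitespace followed by two
# letters in round or square brackets at the end of the name.
SUFFIX = re.compile(r'\s+[\(\[][A-Za-z]{2}[\)\]]$')


def candidate_names(hit_artist):
    """Return the original name first, then the truncated name if it differs."""
    names = [hit_artist]
    stripped = SUFFIX.sub('', hit_artist)
    if stripped != hit_artist:
        names.append(stripped)
    return names


def artist_matches(wanted, hit_artist):
    """Check the wanted artist against both forms of the hit's artist name."""
    wanted = wanted.strip().lower()
    return any(wanted == name.strip().lower()
               for name in candidate_names(hit_artist))
```

Checking the original name first means an artist whose real name ends in something like "(UK)" still matches exactly. The truncated form is only a fallback, so a library entry tagged "Psychonaut" can match a hit named "Psychonaut (BE)". It does not resolve which of several same-named bands is the right one.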