True! It uses exact matches, with a few heuristic hacks. You can see an existing hack there that tries to normalize quote characters, for example: https://github.com/beetbox/beets/blob/8cfbc8274ee3b6ea42433a50becc19f1324f5cef/beetsplug/lastimport.py#L221-L229
You can be infinitely creative about how to cope with parentheses.
I'm thinking of contributing this feature to the plugin. My plan so far is to have a function (either written by me or taken from another open source project that has already dealt with this) detect featured artists from the title string alone, create variations of the title (e.g. ['Touch', 'Touch feat. Paul Williams', 'Touch (ft. Paul Williams)']), and then query the library with each variation in turn to cover all bases.
Am I missing something about the Last.fm API that could be used to avoid this route? I assumed there isn't, otherwise it would have been implemented from the beginning.
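A minimal sketch of the "create variations of the title" step, assuming the featured artists appear only as a trailing suffix of the title. The names `FEAT_SUFFIX` and `title_variations` are hypothetical and are not part of beets or the lastimport plugin; the pattern is a heuristic and does not handle nested cases such as "(with A feat. B)".

```python
import re

# Hypothetical pattern: an optional opening bracket, a "feat"-style token,
# the artist text, and an optional closing bracket at the end of the title.
FEAT_SUFFIX = re.compile(
    r"\s*[\(\[]?\s*\b(?:feat\.?|ft\.?|featuring)\s+"
    r"(?P<artists>[^\)\]]+?)\s*[\)\]]?\s*$",
    re.IGNORECASE,
)


def title_variations(title):
    """Return candidate spellings of `title`, with the bare title first."""
    match = FEAT_SUFFIX.search(title)
    if match is None:
        return [title]
    base = title[: match.start()].rstrip()
    if not base:
        # The whole title looked like a "feat." clause; leave it untouched.
        return [title]
    artists = match.group("artists")
    return [
        base,
        f"{base} feat. {artists}",
        f"{base} (feat. {artists})",
        f"{base} (ft. {artists})",
    ]


print(title_variations("Touch (feat. Paul Williams)"))
```

For "Touch (feat. Paul Williams)" this yields the bare "Touch" plus three spellings with the featured artist, so both the Last.fm form and the library form are among the candidates.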
Yep, I think you’re on the right track doing it this way! Thanks for looking into it.
I was considering using feat_tokens from plugins.py, but quickly realized that it is insufficient for this task:
import re

from beets import plugins

# Regex string of "featuring" tokens provided by beets.
feat_tokens = plugins.feat_tokens()

titles = [
    'Sucker For Pain (with Wiz Khalifa, Imagine Dragons, Logic & Ty Dolla $ign feat. X Ambassadors)',
    'Ticker Tape (feat. Carly Simon & Kali Uchis)',
    'Lose Yourself to Dance (feat. Pharrell Williams)',
    'She’s My Collar (Feat. Kali Uchis)',
    'SIRENS | Z1RENZ [FEAT. J.I.D | J.1.D]',
    'She Wolf (Falling to Pieces) [feat. Sia]',
    'Love’s Vagrant (Ringabell) [ft. Ralfington]',
]

for title in titles:
    print(title)
    # List every token the pattern finds in this title.
    print(re.findall(feat_tokens, title))
This prints:
Sucker For Pain (with Wiz Khalifa, Imagine Dragons, Logic & Ty Dolla $ign feat. X Ambassadors)
['&', 'feat.']
Ticker Tape (feat. Carly Simon & Kali Uchis)
['&']
Lose Yourself to Dance (feat. Pharrell Williams)
[]
She’s My Collar (Feat. Kali Uchis)
[]
SIRENS | Z1RENZ [FEAT. J.I.D | J.1.D]
[]
She Wolf (Falling to Pieces) [feat. Sia]
[]
Love’s Vagrant (Ringabell) [ft. Ralfington]
[]
It does not check for parentheses, brackets, capitalization, etc.
But I'm not sure whether I should work on the function to expand its matching ability. Such a regex would be a monstrous one, since it would have to find, for example: 'ft.', 'Ft.', 'FT.', 'feat.', 'Feat.', 'FEAT.', 'f/', 'F/', 'f.', 'F.', 'featuring', 'Featuring', 'FEATURING', 'with', 'With', 'WITH', 'vs', 'vs.', 'VS', 'VS.', 'Vs.', 'Vs', 'and', 'And', 'AND', 'con', 'Con', 'CON', '&', etc.
I'm worried that such a change would cause problems for other plugins and match too much. On the other hand, it might benefit other plugins that use the function by increasing their accuracy.
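The capitalization variants may not need to be listed one by one: a case-insensitive flag covers them, and allowing an opening bracket before the token handles the parenthesized forms. The sketch below is a standalone, hypothetical pattern (`TOKENS` and `FEAT_RE` are illustrative names, not the beets `feat_tokens` function). It deliberately leaves out 'with', 'and', and '&', on the assumption that they would match too much inside ordinary titles.

```python
import re

# Hypothetical token list, lowercase only; re.IGNORECASE covers 'Feat.',
# 'FEAT.', and so on. Longer tokens come first so that 'featuring' is
# tried before 'feat'.
TOKENS = ["featuring", "feat.", "feat", "ft.", "ft"]

FEAT_RE = re.compile(
    # The token must start the string or follow whitespace or an opening
    # bracket, and must be followed by whitespace.
    r"(?<![^\s(\[])(?:%s)(?=\s)" % "|".join(re.escape(t) for t in TOKENS),
    re.IGNORECASE,
)

titles = [
    "Ticker Tape (feat. Carly Simon & Kali Uchis)",
    "SIRENS | Z1RENZ [FEAT. J.I.D | J.1.D]",
    "Love's Vagrant (Ringabell) [ft. Ralfington]",
    "Left Behind",
]

for title in titles:
    print(title, FEAT_RE.findall(title))
```

This finds 'feat.', 'FEAT.', and 'ft.' in the first three titles and nothing in "Left Behind", because the "ft" inside "Left" does not follow whitespace or a bracket.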
I'm thinking I have 3 main options:
feat_tokens
feat_tokens
(perhaps expanded_feat_tokens or something) in plugins.py
What do you think?
Hmm… it seems like the patterns you're looking for are pretty idiosyncratic to Last.fm's chaotic naming conventions. So maybe special-purpose logic just for the plugin is in order?
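Plugin-local logic of this kind could be as small as a loop that tries each candidate title until one is found. The sketch below assumes a generic `lookup` callable that returns `None` on a miss; `find_first_match` is a hypothetical helper, and the dictionary only stands in for a real library query.

```python
def find_first_match(candidates, lookup):
    """Return the first non-None result of `lookup` over `candidates`."""
    for candidate in candidates:
        result = lookup(candidate)
        if result is not None:
            return result
    return None


# Stand-in for a library query: a dict keyed by title plays the role of
# the music library here (an assumption made for illustration only).
library = {"Touch": "item-1", "Compton feat. Dr. Dre": "item-2"}

# The Last.fm spelling misses, the bare title hits.
print(find_first_match(["Touch (feat. Paul Williams)", "Touch"], library.get))
```

Keeping the lookup as a parameter means the same loop works with exact matching or with any looser query, without touching code shared with other plugins.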
This plugin is great; however, I noticed that some of my library tracks did not get matched with Last.fm data due to minor differences in the titles.
For example, in Random Access Memories by Daft Punk:
"Touch (feat. Paul Williams)" - Last.fm
"Touch" - My library
Or even worse, from M.A.A.D City:
"Compton (feat. Dr. Dre)" - Last.fm
"Compton feat. Dr. Dre" - My library (following MusicBrainz style guidelines)
I would try to fix it myself but I want to make sure I'm not missing something first.