beetbox / beets

music library manager and MusicBrainz tagger
http://beets.io/
MIT License

Incorrect song matching #2711

Closed mgray88 closed 4 years ago

mgray88 commented 6 years ago

Problem

When I ran beet import, it wasn't able to identify the album from the folder name, so I entered the MusicBrainz ID of the release I wanted to match.

That worked, and the first two songs matched correctly, as can be seen below. However, the third song was incorrectly matched with a different song:

[screenshot from 2017-10-14 23-58-07] [screenshot from 2017-10-14 23-58-20]

Setup

My configuration (output of beet config) is:

directory: /mnt/Media/Music
import:
    move: yes
plugins: discogs
discogs:
    user_token: ***
paths:
    singleton: $albumartist/$album%aunique{}/$track $title
match:
    preferred:
        countries: ['US', 'GB|UK']
        media: ['CD', 'Digital Media|File']

sampsyo commented 6 years ago

Interesting! Our matching heuristic isn't perfect, of course. If you're interested in digging into exactly why this didn't match the way you expected, please provide a test case we can reproduce.

jackwilsdon commented 6 years ago

This sounds like we should maybe treat changes to non-alphanumeric characters as being "closer" than changes to letters or digits, perhaps by weighting them differently? (I'm not quite sure how our distance calculations work at the moment, though.)

If anyone is interested in taking a look, the area of interest seems to be the string_dist function (although correct me if I'm wrong @sampsyo).

sampsyo commented 6 years ago

Yes, that’s the right function to look at! I don’t quite understand the hypothesis, though, @jackwilsdon—which non-alphanumeric characters are you referring to?

jackwilsdon commented 6 years ago

It seems to have ranked "Ride the River" higher than "My Father's Eyes" when searching for "My Fathers Eyes" :confused:

sampsyo commented 6 years ago

Got it! So the hypothesis is that the apostrophe is counting for more than the letters. I can’t think of anything specific that would cause this, but it’s worth investigating.

(Sorry for the close/reopen noise; I hit the wrong button. 🙄)

jackwilsdon commented 6 years ago

Just did a little bit of testing and got these results:

| str1 | str2 | string_dist |
| --- | --- | --- |
| My Fathers Eyes | Ride the River | 0.6153846153846154 |
| My Fathers Eyes | My Father's Eyes | 0 |

So I guess that proves my hypothesis wrong!
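
For anyone who wants to poke at this without a beets checkout, here is a minimal sketch of a distance that behaves the same way on these two inputs. It is not the real string_dist: `simple_string_dist` and `levenshtein` are hypothetical names, and the normalization (lowercase, drop everything except letters and digits, divide the edit distance by the longer length) is an assumption that happens to be consistent with the numbers above. The real function may apply further adjustments that this sketch leaves out.

```python
import re


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute, or keep if equal
            ))
        previous = current
    return previous[-1]


def simple_string_dist(str1: str, str2: str) -> float:
    """Hypothetical stand-in for string_dist: 0.0 means identical after normalization."""
    # Assumption: only lowercase ASCII letters and digits take part in the comparison.
    norm1 = re.sub(r"[^a-z0-9]", "", str1.lower())
    norm2 = re.sub(r"[^a-z0-9]", "", str2.lower())
    if not norm1 and not norm2:
        return 0.0
    return levenshtein(norm1, norm2) / max(len(norm1), len(norm2))


print(simple_string_dist("My Fathers Eyes", "My Father's Eyes"))  # 0.0
print(simple_string_dist("My Fathers Eyes", "Ride the River"))    # 0.6153846153846154
```

Under that assumption the apostrophe is removed before any comparison happens, so it cannot count for more than a letter: both spellings reduce to the same 13 characters. The second value is 8 edits divided by 13 characters.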

sampsyo commented 6 years ago

Hmm... thank you for investigating! This is certainly mysterious.

nbartnik commented 6 years ago

Hi,

This issue occurs when both of the following conditions are met:

1) The alphanumeric sort order of the track titles differs from the album's track list order.
2) The track number is missing from the tags.

For example:

Poor matching with only track title, album title and artist:

Tagging:
    Jonathan Coulton - Artificial Heart
URL:
    https://musicbrainz.org/release/6f238401-4ba3-4691-8c62-02242627a28c
(Similarity: 48.6%) (tracks) (CD, 2011, US, Jocoserious Records)
 * Down Today.mp3 (#0)               -> Sticking It to Myself (#1) (title)
 * Artificial Heart.mp3 (#0)         -> Artificial Heart (#2) (title)
 * Dissolve.mp3 (#0)                 -> Nemeses (#3) (title)
 * Alone at Home.mp3 (#0)            -> The World Belongs to You (#4) (title)
 * Fraud.mp3 (#0)                    -> Today With Your Wife (#5) (title)
 * Sucker Punch.mp3 (#0)             -> Sucker Punch (#6) (title)
 * Glasses.mp3 (#0)                  -> Glasses (#7) (title)
 * Good Morning Tucson.mp3 (#0)      -> Je suis Rick Springfield (#8) (title)
 * The World Belongs to You.mp3 (#0) -> Alone at Home (#9) (title)
 * Nemeses.mp3 (#0)                  -> Fraud (#10) (title)
 * Je suis Rick Springfield.mp3 (#0) -> Good Morning Tucson (#11) (title)
 * Now I Am an Arsonist.mp3 (#0)     -> Now I Am an Arsonist (#12) (title)
 * Nobody Loves You Like Me.mp3 (#0) -> Down Today (#13) (title)
 * The Stache.mp3 (#0)               -> Dissolve (#14) (title)
 * Sticking It to Myself.mp3 (#0)    -> Nobody Loves You Like Me (#15) (title)
 * Still Alive.mp3 (#0)              -> Still Alive (#16) (title)
 * Want You Gone.mp3 (#0)            -> Want You Gone (#17) (title)
 * Today With Your Wife.mp3 (#0)     -> The Stache (#18) (title)

Same album, but with track name sort order matching the album tracklist order:

Tagging:
    Jonathan Coulton - Artificial Heart
URL:
    https://musicbrainz.org/release/6f238401-4ba3-4691-8c62-02242627a28c
(Similarity: 92.2%) (tracks) (CD, 2011, US, Jocoserious Records)
 * aa Sticking It to Myself (#0)    -> Sticking It to Myself (#1) (title)
 * ab Artificial Heart (#0)         -> Artificial Heart (#2) (title)
 * ac Nemeses (#0)                  -> Nemeses (#3) (title)
 * ad The World Belongs to You (#0) -> The World Belongs to You (#4) (title)
 * ae Today With Your Wife (#0)     -> Today With Your Wife (#5) (title)
 * af Sucker Punch (#0)             -> Sucker Punch (#6) (title)
 * ag Glasses (#0)                  -> Glasses (#7) (title)
 * ah Je suis Rick Springfield (#0) -> Je suis Rick Springfield (#8) (title)
 * ai Alone at Home (#0)            -> Alone at Home (#9) (title)
 * aj Fraud (#0)                    -> Fraud (#10) (title)
 * ak Good Morning Tucson (#0)      -> Good Morning Tucson (#11) (title)
 * al Now I Am an Arsonist (#0)     -> Now I Am an Arsonist (#12) (title)
 * am Down Today (#0)               -> Down Today (#13) (title)
 * an Dissolve (#0)                 -> Dissolve (#14) (title)
 * ao Nobody Loves You Like Me (#0) -> Nobody Loves You Like Me (#15) (title)
 * ap Still Alive (#0)              -> Still Alive (#16) (title)
 * aq Want You Gone (#0)            -> Want You Gone (#17) (title)
 * ar The Stache (#0)               -> The Stache (#18) (title)

sampsyo commented 6 years ago

Hi! I think the problem you're seeing here, @nbartnik, is that the first version has no title tags. You're seeing the filenames as a fallback. Please consider enabling the fromfilename plugin.
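
To illustrate the point about missing title tags, here is a rough sketch of why a filename-derived title helps a title-based comparison. This is not the fromfilename plugin's code and not how beets assigns tracks: `title_from_filename` and `best_title_match` are hypothetical helpers, and the similarity measure is Python's standard `difflib.SequenceMatcher`, chosen only to keep the example self-contained.

```python
import re
from difflib import SequenceMatcher
from pathlib import Path


def normalize(text: str) -> str:
    """Lowercase and keep only ASCII letters and digits."""
    return re.sub(r"[^a-z0-9]", "", text.lower())


def title_from_filename(filename: str) -> str:
    """Hypothetical helper: use the file stem as a stand-in for a missing title tag."""
    return Path(filename).stem


def similarity(title: str, candidate: str) -> float:
    """Similarity in [0, 1] between two normalized titles."""
    return SequenceMatcher(None, normalize(title), normalize(candidate)).ratio()


def best_title_match(title: str, candidates: list) -> str:
    """Return the candidate track title most similar to the given title."""
    return max(candidates, key=lambda candidate: similarity(title, candidate))


# A few track titles from the release above.
album_tracks = ["Sticking It to Myself", "Artificial Heart", "Nemeses", "Down Today"]

for filename in ["Down Today.mp3", "Nemeses.mp3"]:
    guess = title_from_filename(filename)
    print(filename, "->", best_title_match(guess, album_tracks))
```

With an empty title, `similarity` returns 0.0 for every candidate in this sketch, so the title carries no information for telling tracks apart. Once a title is derived from the filename, each file has exactly one best candidate.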

stale[bot] commented 4 years ago

Is this still relevant? If so, what is blocking it? Is there anything you can do to help move it forward?

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

MCMXD commented 1 week ago

Are there any add-ons that can correct this sort of error from the CLI during import?