beetbox / beets

music library manager and MusicBrainz tagger
http://beets.io/
MIT License

Incremental option not working #4315

Closed (a0g83agbc84 closed this issue 2 years ago)

a0g83agbc84 commented 2 years ago

Problem

Running import on the same directory with incremental: yes re-imports songs that were already added, creating thousands of duplicates that then have to be removed.

Command:

$ beet import /mnt/tmp/music

Setup

My configuration (output of beet config) is:

directory: /mnt/music
library: ~/beet/library.db
asciify_paths: yes
threaded: yes

import:
    copy: yes
    autotag: no
    write: no
    incremental: yes
    group_albums: yes
    duplicate_action: ask
    log: ~/beet/import.log

paths:
    default: $albumartist/$album%aunique{}/$track - $title
    singleton: Non-Album/$artist/$title
    comp: Compilations/$album%aunique{}/$track - $title

plugins: duplicates

duplicates:
    album: no
    checksum: ''
    copy: ''
    count: no
    delete: no
    format: ''
    full: no
    keys: []
    merge: no
    move: ''
    path: no
    tiebreak: {}
    strict: no
    tag: ''

jackwilsdon commented 2 years ago

Can you provide the output from a verbose import (running with -vv)?

a0g83agbc84 commented 2 years ago

@jackwilsdon Thanks for the super quick reply. I created a new directory with just a few songs to avoid importing everything again. Here's the output:

music@debian:~$ beet -vv import /mnt/tmp/music_small/
user configuration: /home/music/.config/beets/config.yaml
data directory: /home/music/.config/beets
plugin paths:
Sending event: pluginload
library database: /home/music/beet/library.db
library directory: /mnt/music
Sending event: library_opened
Sending event: import_begin
Import of the directory:
/mnt/tmp/music_small
was interrupted. Resume (Y/n)? y
Resuming interrupted import of /mnt/tmp/music_small
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
/mnt/tmp/music_small/Eminem - Ass Like That.mp3
0 of 1 items replaced
Sending event: database_change
Sending event: database_change
Sending event: database_change
Sending event: database_change
/mnt/tmp/music_small/Eminem  Dido - Stan.mp3; /mnt/tmp/music_small/Eminem  Nate Dogg - Shake That.mp3; /mnt/tmp/music_small/Eminem - Just Lose It.mp3; /mnt/tmp/music_small/Eminem - Lose Yourself From 8 Mile Soundtrack.mp3; /mnt/tmp/music_small/Eminem - Mockingbird.mp3; /mnt/tmp/music_small/Eminem - The Real Slim Shady.mp3; /mnt/tmp/music_small/Eminem - When Im Gone.mp3
0 of 7 items replaced
Sending event: database_change
Sending event: database_change
Sending event: database_change
Sending event: database_change
Sending event: database_change
Sending event: database_change
Sending event: database_change
Sending event: database_change
Sending event: database_change
Sending event: database_change
Sending event: database_change
Sending event: database_change
Sending event: database_change
Sending event: database_change
Sending event: database_change
Sending event: database_change
/mnt/tmp/music_small/Eminem - Venom Music From The Motion Picture.mp3
0 of 1 items replaced
Sending event: database_change
Sending event: database_change
Sending event: database_change
Sending event: database_change
/mnt/tmp/music_small/Eminem  Juice Wrld - Godzilla.mp3; /mnt/tmp/music_small/Eminem & Juice Wrld - Godzilla.mp3
0 of 2 items replaced
Sending event: database_change
Sending event: database_change
Sending event: database_change
Sending event: database_change
Sending event: database_change
Sending event: database_change
/mnt/tmp/music_small/Eminem - Rap God.mp3
0 of 1 items replaced
Sending event: database_change
Sending event: database_change
Sending event: database_change
Sending event: database_change
/mnt/tmp/music_small/Eminem  Rihanna - Love The Way You Lie.mp3
0 of 1 items replaced
Sending event: database_change
Sending event: database_change
Sending event: database_change
Sending event: database_change
/mnt/tmp/music_small/Eminem  Dr Dre  50 Cent - Crack A Bottle.mp3
0 of 1 items replaced
Sending event: database_change
Sending event: database_change
Sending event: database_change
Sending event: database_change
/mnt/tmp/music_small/Eminem  Dina Rae - Superman.mp3; /mnt/tmp/music_small/Eminem - Sing For The Moment.mp3; /mnt/tmp/music_small/Eminem - Without Me.mp3
0 of 3 items replaced
Sending event: database_change
Sending event: database_change
Sending event: database_change
Sending event: database_change
Sending event: database_change
Sending event: database_change
Sending event: database_change
Sending event: database_change
/mnt/tmp/music_small/Eminem  Dr Dre  Snoop Dogg  Xzibit  Nate Dogg - Bitch Please II.mp3; /mnt/tmp/music_small/Eminem - Kill You.mp3
0 of 2 items replaced
Sending event: database_change
Sending event: database_change
Sending event: database_change
Sending event: database_change
Sending event: database_change
Sending event: database_change
Sending event: item_copied
Sending event: database_change
Sending event: database_change
Sending event: database_change
Sending event: import_task_files
Sending event: album_imported
Sending event: item_copied
Sending event: database_change
Sending event: database_change
Sending event: item_copied
Sending event: database_change
Sending event: database_change
Sending event: item_copied
Sending event: database_change
Sending event: database_change
Sending event: item_copied
Sending event: database_change
Sending event: database_change
Sending event: item_copied
Sending event: database_change
Sending event: database_change
Sending event: item_copied
Sending event: database_change
Sending event: database_change
Sending event: item_copied
Sending event: database_change
Sending event: database_change
Sending event: database_change
Sending event: database_change
Sending event: database_change
Sending event: database_change
Sending event: database_change
Sending event: database_change
Sending event: database_change
Sending event: import_task_files
Sending event: album_imported
Sending event: item_copied
Sending event: database_change
Sending event: database_change
Sending event: database_change
Sending event: import_task_files
Sending event: album_imported
Sending event: item_copied
Sending event: database_change
Sending event: database_change
Sending event: item_copied
Sending event: database_change
Sending event: database_change
Sending event: database_change
Sending event: database_change
Sending event: import_task_files
Sending event: album_imported
Sending event: item_copied
Sending event: database_change
Sending event: database_change
Sending event: database_change
Sending event: import_task_files
Sending event: album_imported
Sending event: item_copied
Sending event: database_change
Sending event: database_change
Sending event: database_change
Sending event: import_task_files
Sending event: album_imported
Sending event: item_copied
Sending event: database_change
Sending event: database_change
Sending event: database_change
Sending event: import_task_files
Sending event: album_imported
Sending event: item_copied
Sending event: database_change
Sending event: database_change
Sending event: item_copied
Sending event: database_change
Sending event: database_change
Sending event: item_copied
Sending event: database_change
Sending event: database_change
Sending event: database_change
Sending event: database_change
Sending event: database_change
Sending event: import_task_files
Sending event: album_imported
Sending event: item_copied
Sending event: database_change
Sending event: database_change
Sending event: item_copied
Sending event: database_change
Sending event: database_change
Sending event: database_change
Sending event: database_change
Sending event: import_task_files
Sending event: album_imported
Sending event: import
Sending event: cli_exit

And this is what the directory looks like after the import:

$ ls /mnt/music/Eminem/ -l
total 0
'Ass Like That'
'Ass Like That [1130]'
'Curtain Call_ The Hits'
'Curtain Call_ The Hits [1131]'
 Kamikaze
'Kamikaze [1132]'
'Music To Be Murdered By'
'Music To Be Murdered By [1133]'
'Rap God'
'Rap God [1134]'
 Recovery
'Recovery [1135]'
 Relapse
'Relapse [1136]'
'Relapse_ Refill'
'The Eminem Show'
'The Eminem Show [1137]'
'The Marshall Mathers LP'
'The Marshall Mathers LP [1138]'

Extra output of duplicates plugin:

$ beet duplicates -k title -k albumartist -k album
Eminem - Ass Like That - Ass Like That
Eminem - Curtain Call: The Hits - Lose Yourself (From "8 Mile" Soundtrack)
Eminem - Curtain Call: The Hits - The Real Slim Shady
Eminem - Curtain Call: The Hits - Mockingbird
Eminem - Curtain Call: The Hits - Just Lose It
Eminem - Curtain Call: The Hits - When I'm Gone
Eminem - Kamikaze - Venom (Music From The Motion Picture)
Eminem - Rap God - Rap God
Eminem - The Eminem Show - Without Me
Eminem - The Eminem Show - Sing For The Moment
Eminem - The Marshall Mathers LP - Kill You
Eminem & Dido - Curtain Call: The Hits - Stan
Eminem & Dina Rae - The Eminem Show - Superman
Eminem & Dr. Dre & 50 Cent - Relapse - Crack A Bottle
Eminem & Dr. Dre & Snoop Dogg & Xzibit & Nate Dogg - The Marshall Mathers LP - Bitch Please II
Eminem & Juice Wrld - Music To Be Murdered By - Godzilla
Eminem & Juice Wrld - Music To Be Murdered By - Godzilla
Eminem & Nate Dogg - Curtain Call: The Hits - Shake That
Eminem & Rihanna - Recovery - Love The Way You Lie

Let me know if you need anything else :)

sampsyo commented 2 years ago

Thanks for the output! It's a little hard to see exactly what's going on here, unfortunately… to satisfy our curiosity, can you try running exactly the same command twice in a row (i.e., beet -vv import /mnt/tmp/music_small/) and seeing if anything changes?

a0g83agbc84 commented 2 years ago

The import output looks exactly the same, I think. You can see from ls and duplicates that there are now two duplicates of each album. Could it be related to group_albums: yes? I suspect that flag changes how files are processed, since I have one huge directory, /mnt/tmp/music, where I dump all my music as MP3s with no subdirectories at all.

Import:

$ beet -vv import /mnt/tmp/music_small/
user configuration: /home/music/.config/beets/config.yaml
data directory: /home/music/.config/beets
plugin paths:
Sending event: pluginload
library database: /home/music/beet/library.db
library directory: /mnt/music
Sending event: library_opened
Sending event: import_begin
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
Sending event: import_task_created
/mnt/tmp/music_small/Eminem - Ass Like That.mp3
0 of 1 items replaced
Sending event: database_change
Sending event: database_change
Sending event: database_change
Sending event: database_change
/mnt/tmp/music_small/Eminem  Dido - Stan.mp3; /mnt/tmp/music_small/Eminem  Nate Dogg - Shake That.mp3; /mnt/tmp/music_small/Eminem - Just Lose It.mp3; /mnt/tmp/music_small/Eminem - Lose Yourself From 8 Mile Soundtrack.mp3; /mnt/tmp/music_small/Eminem - Mockingbird.mp3; /mnt/tmp/music_small/Eminem - The Real Slim Shady.mp3; /mnt/tmp/music_small/Eminem - When Im Gone.mp3
0 of 7 items replaced
Sending event: database_change
Sending event: database_change
Sending event: database_change
Sending event: database_change
Sending event: database_change
Sending event: database_change
Sending event: database_change
Sending event: database_change
Sending event: database_change
Sending event: database_change
Sending event: database_change
Sending event: database_change
Sending event: database_change
Sending event: database_change
Sending event: database_change
Sending event: database_change
/mnt/tmp/music_small/Eminem - Venom Music From The Motion Picture.mp3
0 of 1 items replaced
Sending event: database_change
Sending event: database_change
Sending event: database_change
Sending event: database_change
/mnt/tmp/music_small/Eminem  Juice Wrld - Godzilla.mp3; /mnt/tmp/music_small/Eminem & Juice Wrld - Godzilla.mp3
0 of 2 items replaced
Sending event: database_change
Sending event: database_change
Sending event: database_change
Sending event: database_change
Sending event: database_change
Sending event: database_change
/mnt/tmp/music_small/Eminem - Rap God.mp3
0 of 1 items replaced
Sending event: database_change
Sending event: database_change
Sending event: database_change
Sending event: database_change
/mnt/tmp/music_small/Eminem  Rihanna - Love The Way You Lie.mp3
0 of 1 items replaced
Sending event: database_change
Sending event: database_change
Sending event: database_change
Sending event: database_change
/mnt/tmp/music_small/Eminem  Dr Dre  50 Cent - Crack A Bottle.mp3
0 of 1 items replaced
Sending event: database_change
Sending event: database_change
Sending event: database_change
Sending event: database_change
/mnt/tmp/music_small/Eminem  Dina Rae - Superman.mp3; /mnt/tmp/music_small/Eminem - Sing For The Moment.mp3; /mnt/tmp/music_small/Eminem - Without Me.mp3
0 of 3 items replaced
Sending event: database_change
Sending event: database_change
Sending event: database_change
Sending event: database_change
Sending event: database_change
Sending event: database_change
Sending event: database_change
Sending event: database_change
/mnt/tmp/music_small/Eminem  Dr Dre  Snoop Dogg  Xzibit  Nate Dogg - Bitch Please II.mp3; /mnt/tmp/music_small/Eminem - Kill You.mp3
0 of 2 items replaced
Sending event: database_change
Sending event: database_change
Sending event: database_change
Sending event: database_change
Sending event: database_change
Sending event: database_change
Sending event: item_copied
Sending event: database_change
Sending event: database_change
Sending event: database_change
Sending event: import_task_files
Sending event: album_imported
Sending event: item_copied
Sending event: database_change
Sending event: database_change
Sending event: item_copied
Sending event: database_change
Sending event: database_change
Sending event: item_copied
Sending event: database_change
Sending event: database_change
Sending event: item_copied
Sending event: database_change
Sending event: database_change
Sending event: item_copied
Sending event: database_change
Sending event: database_change
Sending event: item_copied
Sending event: database_change
Sending event: database_change
Sending event: item_copied
Sending event: database_change
Sending event: database_change
Sending event: database_change
Sending event: database_change
Sending event: database_change
Sending event: database_change
Sending event: database_change
Sending event: database_change
Sending event: database_change
Sending event: import_task_files
Sending event: album_imported
Sending event: item_copied
Sending event: database_change
Sending event: database_change
Sending event: database_change
Sending event: import_task_files
Sending event: album_imported
Sending event: item_copied
Sending event: database_change
Sending event: database_change
Sending event: item_copied
Sending event: database_change
Sending event: database_change
Sending event: database_change
Sending event: database_change
Sending event: import_task_files
Sending event: album_imported
Sending event: item_copied
Sending event: database_change
Sending event: database_change
Sending event: database_change
Sending event: import_task_files
Sending event: album_imported
Sending event: item_copied
Sending event: database_change
Sending event: database_change
Sending event: database_change
Sending event: import_task_files
Sending event: album_imported
Sending event: item_copied
Sending event: database_change
Sending event: database_change
Sending event: database_change
Sending event: import_task_files
Sending event: album_imported
Sending event: item_copied
Sending event: database_change
Sending event: database_change
Sending event: item_copied
Sending event: database_change
Sending event: database_change
Sending event: item_copied
Sending event: database_change
Sending event: database_change
Sending event: database_change
Sending event: database_change
Sending event: database_change
Sending event: import_task_files
Sending event: album_imported
Sending event: item_copied
Sending event: database_change
Sending event: database_change
Sending event: item_copied
Sending event: database_change
Sending event: database_change
Sending event: database_change
Sending event: database_change
Sending event: import_task_files
Sending event: album_imported
Sending event: import
Sending event: cli_exit

ls:

$ ls /mnt/music/Eminem/ -l
total 0
'Ass Like That'
'Ass Like That [1130]'
'Ass Like That [1139]'
'Curtain Call_ The Hits'
'Curtain Call_ The Hits [1131]'
'Curtain Call_ The Hits [1140]'
 Kamikaze
'Kamikaze [1132]'
'Kamikaze [1141]'
'Music To Be Murdered By'
'Music To Be Murdered By [1133]'
'Music To Be Murdered By [1142]'
'Rap God'
'Rap God [1134]'
'Rap God [1143]'
 Recovery
'Recovery [1135]'
'Recovery [1144]'
 Relapse
'Relapse [1136]'
'Relapse [1145]'
'Relapse_ Refill'
'The Eminem Show'
'The Eminem Show [1137]'
'The Eminem Show [1146]'
'The Marshall Mathers LP'
'The Marshall Mathers LP [1138]'
'The Marshall Mathers LP [1147]'

duplicates:

$ beet duplicates -k title -k albumartist -k album
Eminem - Ass Like That - Ass Like That
Eminem - Ass Like That - Ass Like That
Eminem - Curtain Call: The Hits - Lose Yourself (From "8 Mile" Soundtrack)
Eminem - Curtain Call: The Hits - Lose Yourself (From "8 Mile" Soundtrack)
Eminem - Curtain Call: The Hits - The Real Slim Shady
Eminem - Curtain Call: The Hits - The Real Slim Shady
Eminem - Curtain Call: The Hits - Mockingbird
Eminem - Curtain Call: The Hits - Mockingbird
Eminem - Curtain Call: The Hits - Just Lose It
Eminem - Curtain Call: The Hits - Just Lose It
Eminem - Curtain Call: The Hits - When I'm Gone
Eminem - Curtain Call: The Hits - When I'm Gone
Eminem - Kamikaze - Venom (Music From The Motion Picture)
Eminem - Kamikaze - Venom (Music From The Motion Picture)
Eminem - Rap God - Rap God
Eminem - Rap God - Rap God
Eminem - The Eminem Show - Without Me
Eminem - The Eminem Show - Without Me
Eminem - The Eminem Show - Sing For The Moment
Eminem - The Eminem Show - Sing For The Moment
Eminem - The Marshall Mathers LP - Kill You
Eminem - The Marshall Mathers LP - Kill You
Eminem & Dido - Curtain Call: The Hits - Stan
Eminem & Dido - Curtain Call: The Hits - Stan
Eminem & Dina Rae - The Eminem Show - Superman
Eminem & Dina Rae - The Eminem Show - Superman
Eminem & Dr. Dre & 50 Cent - Relapse - Crack A Bottle
Eminem & Dr. Dre & 50 Cent - Relapse - Crack A Bottle
Eminem & Dr. Dre & Snoop Dogg & Xzibit & Nate Dogg - The Marshall Mathers LP - Bitch Please II
Eminem & Dr. Dre & Snoop Dogg & Xzibit & Nate Dogg - The Marshall Mathers LP - Bitch Please II
Eminem & Juice Wrld - Music To Be Murdered By - Godzilla
Eminem & Juice Wrld - Music To Be Murdered By - Godzilla
Eminem & Juice Wrld - Music To Be Murdered By - Godzilla
Eminem & Juice Wrld - Music To Be Murdered By - Godzilla
Eminem & Nate Dogg - Curtain Call: The Hits - Shake That
Eminem & Nate Dogg - Curtain Call: The Hits - Shake That
Eminem & Rihanna - Recovery - Love The Way You Lie
Eminem & Rihanna - Recovery - Love The Way You Lie

sampsyo commented 2 years ago

Aha, yes, absolutely! Sorry I didn't notice the group_albums option set before. See #1476 for further discussion: incremental mode relies on recording the directory associated with every album, and in group_albums mode, we don't have a unique per-album directory to record.

a0g83agbc84 commented 2 years ago

Oops, I completely missed that open issue. So my guess is that nothing has changed since 2015? 😛 Is there any workaround? If the only solution is a code change, I could try to help with that, but I'd appreciate a pointer on where to start.

sampsyo commented 2 years ago

Unfortunately, the path to a fix is not very clear. Since we use directory paths to keep track of what has been imported, and this option means that albums can be made up of files from many different directories, it's not all that obvious what we should even do.

Maybe we can use the set of file paths instead in this mode? That's somewhat brittle, since even a single new or missing file will mean that the recorded information doesn't match. We could use the artist and album name, but that's also a bit weird because multiple albums on disk could share those. It will take some creativity!
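To make the trade-offs above concrete, here is a toy comparison of the three candidate keys. It is only a sketch, not beets code: `directory_key`, `file_set_key`, and `tag_key` are hypothetical helpers, the extra track name and the `/rips/...` directories are made up, and the real paths are borrowed from the logs earlier in this thread.

```python
from pathlib import PurePosixPath


def directory_key(paths):
    """Key an album by the set of directories that hold its files."""
    return frozenset(str(PurePosixPath(p).parent) for p in paths)


def file_set_key(paths):
    """Key an album by the exact set of its source file paths."""
    return frozenset(paths)


def tag_key(albumartist, album):
    """Key an album by its artist and album tags only."""
    return (albumartist, album)


# Two grouped albums whose files sit in the same flat directory.
rap_god = ["/mnt/tmp/music_small/Eminem - Rap God.mp3"]
kamikaze = ["/mnt/tmp/music_small/Eminem - Venom Music From The Motion Picture.mp3"]

# Directory key: both albums collapse onto one key, so it is not unique per album.
directory_collides = directory_key(rap_god) == directory_key(kamikaze)

# File-set key: a single extra file (hypothetical name) changes the key,
# so the whole album would look new on the next run.
show_before = [
    "/mnt/tmp/music_small/Eminem - Sing For The Moment.mp3",
    "/mnt/tmp/music_small/Eminem - Without Me.mp3",
]
show_after = show_before + ["/mnt/tmp/music_small/hypothetical_new_track.mp3"]
file_set_changed = file_set_key(show_before) != file_set_key(show_after)

# Tag key: two separate on-disk copies (hypothetical directories) share one key.
copies = {
    "/rips/a": ("Eminem", "Relapse"),
    "/rips/b": ("Eminem", "Relapse"),
}
tags_collide = len({tag_key(*tags) for tags in copies.values()}) == 1

print(directory_collides, file_set_changed, tags_collide)
```

Each key fails in a different way: the directory key is too coarse for a flat dump, the file-set key is too sensitive to small changes, and the tag key cannot tell separate copies apart.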

a0g83agbc84 commented 2 years ago

I'm confused. I probably didn't understand your explanation properly so let me go with a proposal and you can let me know how you see it.

What about just recording, in a file, the full path of every song (.mp3, .flac, .wav...) that has been imported, no matter where it ended up being copied or moved to? The destination is solely the user's decision and can be configured with different paths, so beets should have nothing to do with that. Also, renaming is based on tags, so it's pretty much impossible to rely on the destination name: some songs have great tags, others have unreliable ones, and others have none at all.

Then, before renaming, if a file with pretty much the same metadata already exists, beets applies the duplicate_action setting.

When importing, I would say the most important thing is this: if the song located at /mnt/tmp/music/song_01.mp3 has already been dealt with, just skip it. If /mnt/tmp/music/song_01.mp3 was already handled in a previous import but the same song turns up at /mnt/tmp/music/new_album/song_01.mp3, then attempt to import it. If it has the same metadata, beets will rename it the same way as the one added previously, so it's a duplicate (without getting into different bitrates and so on), and duplicate_action can take over.

If the import action is move, the tracking file isn't even needed: the source file is gone after the import, so a later file with the same name as a previously imported one (say, song_01.mp3) could still be a completely different song.

This is similar to how FileBot works: it keeps a record of each file's source path.

What do you think? This seems like the simplest way of keeping track of this. If it's not implemented, I guess it's because it has some caveats, so please let me know :)
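A minimal sketch of the proposal above, assuming the record is a plain JSON list of source paths. None of this is existing beets functionality: `load_seen`, `save_seen`, `run_import`, and the `imported_paths.json` file name are all hypothetical, and the "import" itself is reduced to returning the list of files that would be processed.

```python
import json
import tempfile
from pathlib import Path


def load_seen(record_file):
    """Read the set of already-imported source paths (empty if no record yet)."""
    path = Path(record_file)
    if not path.exists():
        return set()
    return set(json.loads(path.read_text(encoding="utf-8")))


def save_seen(record_file, seen):
    """Persist the set of imported source paths as a sorted JSON list."""
    Path(record_file).write_text(json.dumps(sorted(seen), indent=2), encoding="utf-8")


def run_import(candidates, record_file):
    """Return the candidates not seen before and add them to the record."""
    seen = load_seen(record_file)
    new_files = [p for p in candidates if p not in seen]
    save_seen(record_file, seen | set(new_files))
    return new_files


with tempfile.TemporaryDirectory() as tmp:
    record = Path(tmp) / "imported_paths.json"  # hypothetical record file
    sources = ["/mnt/tmp/music/song_01.mp3", "/mnt/tmp/music/song_02.mp3"]

    first_run = run_import(sources, record)   # nothing recorded yet
    second_run = run_import(sources, record)  # same paths again
    # Same file name under a new directory counts as a new source path.
    third_run = run_import(sources + ["/mnt/tmp/music/new_album/song_01.mp3"], record)

print(first_run, second_run, third_run)
```

In this sketch the second run returns nothing to import, while the copy under `new_album/` is treated as new and would then fall through to whatever duplicate handling is configured.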

sampsyo commented 2 years ago

I see what you're saying! The tricky bit, in terms of actual implementation, is that this will be significantly different from how the incremental option works when group_albums is off. In that setting, we don't keep track of individual files that have been imported (that wouldn't make sense, really); we keep track of the entire directories containing albums that have been imported. Doing file-level incremental tracking for group_albums would require doing it at a different stage in the pipeline, i.e., before we group files together into albums.

Anyway, seems like a very reasonable thing to do, just wanted to point out that it won't exactly be a simple evolution of what's currently there!
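The ordering point can be sketched with two toy stages. This is an illustration under assumptions, not the beets importer pipeline: `skip_seen` and `group_by_album` are hypothetical names, items are plain dictionaries, and the tags are taken from the duplicates output earlier in the thread.

```python
from collections import defaultdict


def skip_seen(items, seen_paths):
    """Stage 1: drop items whose source path was imported before."""
    for item in items:
        if item["path"] not in seen_paths:
            yield item


def group_by_album(items):
    """Stage 2: group the remaining items by (albumartist, album) tags."""
    groups = defaultdict(list)
    for item in items:
        groups[(item["albumartist"], item["album"])].append(item["path"])
    return dict(groups)


items = [
    {"path": "/mnt/tmp/music_small/Eminem - Rap God.mp3",
     "albumartist": "Eminem", "album": "Rap God"},
    {"path": "/mnt/tmp/music_small/Eminem - Without Me.mp3",
     "albumartist": "Eminem", "album": "The Eminem Show"},
    {"path": "/mnt/tmp/music_small/Eminem - Sing For The Moment.mp3",
     "albumartist": "Eminem", "album": "The Eminem Show"},
]

# Paths recorded by an earlier (hypothetical) run.
seen_paths = {
    "/mnt/tmp/music_small/Eminem - Rap God.mp3",
    "/mnt/tmp/music_small/Eminem - Without Me.mp3",
}

# File-level filtering runs first, so grouping only ever sees new files.
albums = group_by_album(skip_seen(items, seen_paths))
print(albums)
```

Because the filter runs on individual files, grouping may produce a partial album (here, one track of The Eminem Show), which is one of the behaviors a real implementation would have to decide how to handle.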

stale[bot] commented 2 years ago

Is this still relevant? If so, what is blocking it? Is there anything you can do to help move it forward?

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.