Open HackingPheasant opened 4 years ago
Hello! I think it’s likely that this is happening because you’re importing from zip files. Any chance you could verify that’s what’s going wrong by importing some non-compressed directories and seeing whether the same problem happens?
Otherwise, any chance you could blame a plugin (by selectively disabling them)?
It's definitely happening because it's importing from zip files. I have no problem with it unzipping to /tmp, but there have got to be some checks so you don't unzip more into /tmp than /tmp can handle.
As for importing a large (as in size) directory of unzipped files, I can try that within a few days.
Cool! Just checking, though: why do you say that it's definitely because of unzipping?
And just to follow up, beets is supposed to delete these directories. And indeed, in your verbose log, you can see:
Removing extracted directory: /tmp/tmp14s0bvtk
Can you confirm that this directory is not actually deleted?
An ideal solution would be to import straight out of the zip file, but looking at our current import pipeline it seems like it's not really feasible (it seems quite filesystem-oriented).
Because, from my past experience, each time beets needs to unzip something it unzips to /tmp, and (as you know) it works a few albums ahead at a time for a better user experience. Usually this combination wouldn't be a problem, but the zips I am importing contain files between 450 MB and 1 GB in size, so just unzipping a handful at once easily fills up /tmp.
I am now importing the directory in batches of 10, and I came across a few entries (in the beets db) that were still at the /tmp unzip location. Below is an example of what I just encountered and described.
$ beet import \#17*
/tmp/tmp4zvxh53y/#172 - Monstercat Call of the Wild (NGHTMRE & Slander Takeover) (2 items)
Correcting tags from:
Monstercat - #172 - Monstercat: Call of the Wild (NGHTMRE & Slander Takeover)
To:
Monstercat - 2017-10-10: #172 – Monstercat: Call of the Wild (NGHTMRE & SLANDER Takeover)
URL:
https://musicbrainz.org/release/be90844b-7fba-4b4c-a520-d6132705f1ff
(Similarity: 87.9%) (media, album, tracks) (Digital Media, 2017, XW, Monstercat, COTW172, Monstercat.com version including "music only" track)
* #172 - Monstercat: Call of the Wild (NGHTMRE & Slander Takeover) ->
2017-10-10: #172 – Monstercat: Call of the Wild (NGHTMRE & SLANDER Takeover) (title)
* #172 - Monstercat: Call of the Wild (NGHTMRE & Slander Takeover) (Music Only) ->
2017-10-10: #172 – Monstercat: Call of the Wild (NGHTMRE & SLANDER Takeover) (music only) (title)
[A]pply, More candidates, Skip, Use as-is, as Tracks, Group albums,
Enter search, enter Id, aBort, eDit, edit Candidates, plaY?
This album is already in the library!
Old: 2 items, FLAC, 1099kbps, 123:56, 976.4 MiB
New: 2 items, FLAC, 1099kbps, 123:56, 976.4 MiB
[S]kip new, Keep both, Remove old, Merge all? R
/tmp/tmppq2q8xwv (2 items)
Correcting tags from:
Monstercat - #173 - Monstercat: Call of the Wild
To:
Monstercat - 2017-10-17: #173 – Monstercat: Call of the Wild
URL:
https://musicbrainz.org/release/4f32b305-9319-45c1-89a3-9cb3484c3991
(Similarity: 80.8%) (tracks, album, media) (Digital Media, 2017, XW, Monstercat, COTW173, Monstercat.com version including "music only" track)
* Monstercat: Call of the Wild EP. 173 -> 2017-10-17: #173 – Monstercat: Call of the Wild (title)
* Monstercat: Call of the Wild EP. 173 (Music Only) -> 2017-10-17: #173 – Monstercat: Call of the Wild (music only) (title)
[A]pply, More candidates, Skip, Use as-is, as Tracks, Group albums,
Enter search, enter Id, aBort, eDit, edit Candidates, plaY?
This album is already in the library!
could not get filesize: [Errno 2] No such file or directory: b'/tmp/tmpitnd9klt/Monstercat - 1 Monstercat: Call of the Wild EP. 173.flac'
could not get filesize: [Errno 2] No such file or directory: b'/tmp/tmpitnd9klt/Monstercat - 2 Monstercat: Call of the Wild EP. 173 (Music Only).flac'
Old: 2 items, FLAC, 1052kbps, 120:32, 0.0 B
New: 2 items, FLAC, 1052kbps, 120:32, 911.6 MiB
[S]kip new, Keep both, Remove old, Merge all? R
/tmp/tmpc_37sw6d/#174 - Monstercat Call of the Wild (Notaker Takeover) (2 items)
Correcting tags from:
Monstercat - #174 - Monstercat: Call of the Wild (Notaker Takeover)
To:
Monstercat - 2017-10-24: #174 – Monstercat: Call of the Wild (Notaker Takeover)
URL:
https://musicbrainz.org/release/29e56bdc-2a39-4041-bc20-2591c4287ce8
(Similarity: 86.7%) (media, album, tracks) (Digital Media, 2017, XW, Monstercat, COTW174, Monstercat.com version including "music only" track)
* #174 - Monstercat: Call of the Wild (Notaker Takeover) ->
2017-10-24: #174 – Monstercat: Call of the Wild (Notaker Takeover) (title)
* #174 - Monstercat: Call of the Wild (Notaker Takeover) (Music Only) ->
2017-10-24: #174 – Monstercat: Call of the Wild (Notaker Takeover) (music only) (title)
[A]pply, More candidates, Skip, Use as-is, as Tracks, Group albums,
Enter search, enter Id, aBort, eDit, edit Candidates, plaY?
This album is already in the library!
could not get filesize: [Errno 2] No such file or directory: b'/tmp/tmpuw5cpqt4/#174 - Monstercat Call of the Wild (Notaker Takeover)/Monstercat - #174 - Monstercat Call of the Wild (Notaker Takeover) - 1 #174 - Monstercat Call of the Wild (Notaker Takeover).flac'
could not get filesize: [Errno 2] No such file or directory: b'/tmp/tmpuw5cpqt4/#174 - Monstercat Call of the Wild (Notaker Takeover)/Monstercat - #174 - Monstercat Call of the Wild (Notaker Takeover) - 2 #174 - Monstercat Call of the Wild (Notaker Takeover) (Music Only).flac'
Old: 2 items, FLAC, 1039kbps, 125:21, 0.0 B
New: 2 items, FLAC, 1039kbps, 125:21, 934.2 MiB
[S]kip new, Keep both, Remove old, Merge all? R
/tmp/tmpa_1iei93/#175 - Monstercat Call of the Wild (Halloween Special) (1 items)
Correcting tags from:
Monstercat - #175 - Monstercat: Call of the Wild (Halloween Special)
To:
Monstercat - 2017-10-31: #175 – Monstercat: Call of the Wild (Halloween Special)
URL:
https://musicbrainz.org/release/e4843a93-2ea2-4b21-9e29-012aae0eaaba
(Similarity: 85.5%) (media, album, tracks) (Digital Media, 2017, XW, Monstercat, COTW175)
* #175 - Monstercat: Call of the Wild (Halloween Special) -> 2017-10-31: #175 – Monstercat: Call of the Wild (Halloween Special) (title)
[A]pply, More candidates, Skip, Use as-is, as Tracks, Group albums,
Enter search, enter Id, aBort, eDit, edit Candidates, plaY? A
This album is already in the library!
could not get filesize: [Errno 2] No such file or directory: b'/tmp/tmpchz4ixpt/#175 - Monstercat Call of the Wild (Halloween Special)/Monstercat - #175 - Monstercat Call of the Wild (Halloween Special) - 1 #175 - Monstercat Call of the Wild (Halloween Special).flac'
Old: 1 items, FLAC, 1087kbps, 62:00, 0.0 B
New: 1 items, FLAC, 1087kbps, 62:00, 482.9 MiB
[S]kip new, Keep both, Remove old, Merge all? R
/tmp/tmpheij3mmz/#177 - Monstercat Call of the Wild (2 items)
Correcting tags from:
Monstercat - #177 - Monstercat: Call of the Wild
To:
Monstercat - 2017-11-21: #177 – Monstercat: Call of the Wild
URL:
https://musicbrainz.org/release/3bdde99f-0e97-451c-9d6c-526ba68ed37b
(Similarity: 83.7%) (album, media, tracks) (Digital Media, 2017, XW, Monstercat, COTW177, Monstercat.com version including "music only" track)
* #177 - Monstercat: Call of the Wild -> 2017-11-21: #177 – Monstercat: Call of the Wild (title)
* #177 - Monstercat: Call of the Wild (Music Only) -> 2017-11-21: #177 – Monstercat: Call of the Wild (music only) (title)
[A]pply, More candidates, Skip, Use as-is, as Tracks, Group albums,
Enter search, enter Id, aBort, eDit, edit Candidates, plaY?
This album is already in the library!
could not get filesize: [Errno 2] No such file or directory: b'/tmp/tmpayea2hf1/#177 - Monstercat Call of the Wild/Monstercat - #177 - Monstercat Call of the Wild - 1 #177 - Monstercat Call of the Wild.flac'
could not get filesize: [Errno 2] No such file or directory: b'/tmp/tmpayea2hf1/#177 - Monstercat Call of the Wild/Monstercat - #177 - Monstercat Call of the Wild - 2 #177 - Monstercat Call of the Wild (Music Only).flac'
Old: 2 items, FLAC, 1107kbps, 117:45, 0.0 B
New: 2 items, FLAC, 1107kbps, 117:45, 934.6 MiB
[S]kip new, Keep both, Remove old, Merge all? R
/tmp/tmp66bi7kwd/#178 - Monstercat Call of the Wild (2 items)
Correcting tags from:
Monstercat - #178 - Monstercat: Call of the Wild
To:
Monstercat - 2017-11-28: #178 – Monstercat: Call of the Wild
URL:
https://musicbrainz.org/release/bcd85a08-3c56-411d-96c2-b143cd2b71fe
(Similarity: 83.7%) (album, media, tracks) (Digital Media, 2017, XW, Monstercat, COTW178, Monstercat.com version including "music only" track)
* #178 - Monstercat: Call of the Wild -> 2017-11-28: #178 – Monstercat: Call of the Wild (title)
* #178 - Monstercat: Call of the Wild (Music Only) -> 2017-11-28: #178 – Monstercat: Call of the Wild (music only) (title)
[A]pply, More candidates, Skip, Use as-is, as Tracks, Group albums,
Enter search, enter Id, aBort, eDit, edit Candidates, plaY?
This album is already in the library!
could not get filesize: [Errno 2] No such file or directory: b'/tmp/tmpn8vjhrcr/#178 - Monstercat Call of the Wild/Monstercat - #178 - Monstercat Call of the Wild - 1 #178 - Monstercat Call of the Wild.flac'
could not get filesize: [Errno 2] No such file or directory: b'/tmp/tmpn8vjhrcr/#178 - Monstercat Call of the Wild/Monstercat - #178 - Monstercat Call of the Wild - 2 #178 - Monstercat Call of the Wild (Music Only).flac'
Old: 2 items, FLAC, 1009kbps, 125:04, 0.0 B
New: 2 items, FLAC, 1009kbps, 125:04, 904.5 MiB
[S]kip new, Keep both, Remove old, Merge all? R
/tmp/tmpzglndop9/#179 - Monstercat Call of the Wild (2 items)
Correcting tags from:
Monstercat - #179 - Monstercat: Call of the Wild
To:
Monstercat - 2017-12-05: #179 – Monstercat: Call of the Wild
URL:
https://musicbrainz.org/release/eea38b67-2203-4386-91be-4e856dfc5aec
(Similarity: 83.7%) (album, media, tracks) (Digital Media, 2017, XW, Monstercat, COTW179, Monstercat.com version including "music only" track)
* #179 - Monstercat: Call of the Wild -> 2017-12-05: #179 – Monstercat: Call of the Wild (title)
* #179 - Monstercat: Call of the Wild (Music Only) -> 2017-12-05: #179 – Monstercat: Call of the Wild (music only) (title)
[A]pply, More candidates, Skip, Use as-is, as Tracks, Group albums,
Enter search, enter Id, aBort, eDit, edit Candidates, plaY?
Can you confirm that this directory is not actually deleted?
Yes, the files were still there even after the subsequent run of beets. I had to run beets a second time with -vv, as shown in my initial report; it didn't clean up the mess from the previous run of beets, and then choked because /tmp was still full.
Edit: I can't check the exact directory you want because my laptop had restarted (flat battery), but I did go and check my bigger screenshot of the /tmp directory, which I didn't include in the initial report, and that folder was indeed removed.
Aha, I see—so the problem is not that the directories are not going to be eventually removed; it's that the unzipping is "running ahead" far enough that the in-flight directories get too big for the /tmp volume.
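This does indeed seem like a problem! But TBH, I'm not entirely sure what the right resolution is. We could add logic to check /tmp's free space and "block" the unzipping process, but this has several drawbacks:
The unzipping stage would need to periodically poll to wait for space to free up.
It's hard (impossible?) to predict how big the expanded files will be before expanding them.
There is a possibility of deadlock if /tmp fails to have enough space even for a single album.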
Anybody have any bright ideas? I suppose we could look into "direct" importing from zip files, as @jackwilsdon mentioned and I mentioned long ago in https://github.com/beetbox/beets/pull/690#issuecomment-40507410, which would be awesome but difficult to implement.
Anyway, the above is my opinion on what's happening, but you guys know beets better than I do ;)
And just to follow up, beets is supposed to delete these directories. And indeed, in your verbose log, you can see:
Yeah it does remove them, but the queueing of the other imports in the background is what screws this up
It's hard (impossible?) to predict how big the expanded files will be before expanding them.
Can you read the header of compressed files to get this kind of info?
Can you read the header of compressed files to get this kind of info?
I think so - ZipFile provides the uncompressed size of each file so we could sum that up, although that still leaves the other issues @sampsyo mentioned above.
It's better than nothing! The header contains self-reported sizes, but they can be wrong—a maliciously or buggily constructed zip file can, for example, report a size that is much smaller than the actual decompressed size.
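As a rough illustration of the "sum up the reported sizes" idea, here is a minimal sketch using Python's standard `zipfile` module. The function name `reported_uncompressed_size` is made up for this example and is not part of beets. As noted above, the sizes come from the archive's own metadata, so the result is only an estimate.

```python
import zipfile


def reported_uncompressed_size(archive_path):
    """Sum the uncompressed sizes that the zip's own metadata claims.

    The values are self-reported by the archive, so a buggy or malicious
    zip can understate them. Treat the result as a hint, not a guarantee.
    """
    with zipfile.ZipFile(archive_path) as zf:
        # ZipInfo.file_size is the recorded size of the uncompressed member.
        return sum(info.file_size for info in zf.infolist())
```

An estimate like this could be compared against the free space in /tmp before extraction starts, with the caveat that it cannot be fully trusted.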
In the meantime, at least documenting this in the FAQ may be good enough. It isn't the biggest problem overall; not too many people are importing tons of large zip files at once (which is why it has taken this long for the issue to even be noticed).
I'd also like to note that if you import a large album and decide to skip it, it isn't removed from /tmp, so that's another thing to keep in mind.
Hmm, that definitely seems more addressable! Is this reproducible if you just import one album and skip it? If so, any chance we could trouble you for a verbose log of that action so we can see where things "bailed out"?
Hmm, as I was getting the verbose logs as requested, I realised the mistake in my last message (it was late at night, sorry). Skipping does clean up after itself; it's abort that does not. I'll attach the verbose logs for both just for reference.
/tmp
Before:
The size of the file I am importing
$ du -sh Monstercat\ Uncaged\ Vol.\ 5\ \(FLAC\).zip
2.7G Monstercat Uncaged Vol. 5 (FLAC).zip
/tmp after using the abort option:
Abort log (Doesn't cleanup /tmp):
$ beet -vv import -t Monstercat\ Uncaged\ Vol.\ 5\ \(FLAC\).zip
user configuration: /home/matt/.config/beets/config.yaml
data directory: /home/matt/.config/beets
plugin paths:
Sending event: pluginload
inline: adding item field multidisc
inline: adding item field vinylorcassette
inline: adding item field vinyl
inline: adding item field cassette
inline: adding item field bootleg
inline: adding item field ep
inline: adding item field single
inline: adding item field live
library database: /run/media/matt/3TB-Drive-01/Audio/.musiclibrary.db
library directory: /run/media/matt/3TB-Drive-01/Audio
Sending event: library_opened
Sending event: import_begin
Extracting archive: /run/media/matt/3TB-Drive-01/MonsterCat/music/Various Artists/Monstercat Uncaged Vol. 5 (FLAC).zip
Archive extracted to: /tmp/tmpiww7lxpq
Sending event: import_task_created
Sending event: import_task_start
Looking up: /tmp/tmpiww7lxpq
Tagging Bossfight - Monstercat Uncaged Vol. 5
No album ID found.
Search terms: Bossfight - Monstercat Uncaged Vol. 5
Album might be VA: True
Searching for MusicBrainz releases with: {'release': 'monstercat uncaged vol. 5', 'artist': 'bossfight', 'tracks': '41'}
Requesting MusicBrainz release a832b001-af53-4211-be40-8b7674718e58
primary MB release type: album
secondary MB release type(s): compilation
Sending event: albuminfo_received
Candidate: Various Artists - Monstercat Uncaged, Vol. 5 (a832b001-af53-4211-be40-8b7674718e58)
Computing track assignment...
...done.
Success. Distance: 0.01
Requesting MusicBrainz release 6dd5c758-e441-4330-879b-a09f134df1e9
primary MB release type: album
secondary MB release type(s): compilation
Sending event: albuminfo_received
Candidate: Various Artists - Monstercat Uncaged, Vol. 5 (6dd5c758-e441-4330-879b-a09f134df1e9)
Computing track assignment...
...done.
Success. Distance: 0.03
Requesting MusicBrainz release dd8466c2-2151-4ca5-a42c-71ce8048811e
primary MB release type: album
secondary MB release type(s): compilation
Sending event: albuminfo_received
Candidate: Various Artists - Monstercat Uncaged, Vol. 7 (dd8466c2-2151-4ca5-a42c-71ce8048811e)
Computing track assignment...
...done.
Success. Distance: 0.59
Requesting MusicBrainz release 39de5cd2-07ba-41cc-baa8-a0b93ff31fe8
primary MB release type: album
secondary MB release type(s): compilation
Sending event: albuminfo_received
Candidate: Various Artists - Monstercat Uncaged, Vol. 8 (39de5cd2-07ba-41cc-baa8-a0b93ff31fe8)
Computing track assignment...
...done.
Success. Distance: 0.57
Requesting MusicBrainz release 5df838ec-da71-4324-99ae-3490fe153a5f
primary MB release type: album
secondary MB release type(s): compilation
Sending event: albuminfo_received
Candidate: Various Artists - Monstercat Uncaged, Vol. 6 (5df838ec-da71-4324-99ae-3490fe153a5f)
Computing track assignment...
...done.
Success. Distance: 0.58
Searching for MusicBrainz releases with: {'release': 'monstercat uncaged vol. 5', 'arid': '89ad4ac3-39f7-470e-963a-56509c546377', 'tracks': '41'}
Requesting MusicBrainz release a832b001-af53-4211-be40-8b7674718e58
primary MB release type: album
secondary MB release type(s): compilation
Sending event: albuminfo_received
Candidate: Various Artists - Monstercat Uncaged, Vol. 5 (a832b001-af53-4211-be40-8b7674718e58)
Duplicate.
Requesting MusicBrainz release 6dd5c758-e441-4330-879b-a09f134df1e9
primary MB release type: album
secondary MB release type(s): compilation
Sending event: albuminfo_received
Candidate: Various Artists - Monstercat Uncaged, Vol. 5 (6dd5c758-e441-4330-879b-a09f134df1e9)
Duplicate.
Requesting MusicBrainz release dd8466c2-2151-4ca5-a42c-71ce8048811e
primary MB release type: album
secondary MB release type(s): compilation
Sending event: albuminfo_received
Candidate: Various Artists - Monstercat Uncaged, Vol. 7 (dd8466c2-2151-4ca5-a42c-71ce8048811e)
Duplicate.
Requesting MusicBrainz release 39de5cd2-07ba-41cc-baa8-a0b93ff31fe8
primary MB release type: album
secondary MB release type(s): compilation
Sending event: albuminfo_received
Candidate: Various Artists - Monstercat Uncaged, Vol. 8 (39de5cd2-07ba-41cc-baa8-a0b93ff31fe8)
Duplicate.
Requesting MusicBrainz release 5df838ec-da71-4324-99ae-3490fe153a5f
primary MB release type: album
secondary MB release type(s): compilation
Sending event: albuminfo_received
Candidate: Various Artists - Monstercat Uncaged, Vol. 6 (5df838ec-da71-4324-99ae-3490fe153a5f)
Duplicate.
Evaluating 5 candidates.
/tmp/tmpiww7lxpq (41 items)
Sending event: before_choose_candidate
Correcting tags from:
Monstercat Uncaged Vol. 5
To:
Monstercat Uncaged, Vol. 5
URL:
https://musicbrainz.org/release/a832b001-af53-4211-be40-8b7674718e58
(Similarity: 98.9%) (media, tracks) (Digital Media, 2018, XW, Monstercat, MCUV5)
* What's Going Down -> What’s Going Down
* New Age (feat. Celldweller) -> New Age (title, artist)
* This Feeling (feat. Kalibwoy) -> This Feeling (title, artist)
* She's A Killer -> She’s a Killer
* Monster (feat. Panther) -> Monster (title, artist)
* In The Night (feat. Sullivan King) -> In the Night (title, artist)
* Knockin' -> Knockin’
* Uncaged Vol. 5 (Album Mix) -> Uncaged, Vol. 5 (album mix)
[A]pply, More candidates, Skip, Use as-is, as Tracks, Group albums,
Enter search, enter Id, aBort, eDit, edit Candidates, plaY? B
Sending event: import
Sending event: cli_exit
Skip Log (Does cleanup /tmp, added for reference):
$ beet -vv import -t Monstercat\ Uncaged\ Vol.\ 5\ \(FLAC\).zip
user configuration: /home/matt/.config/beets/config.yaml
data directory: /home/matt/.config/beets
plugin paths:
Sending event: pluginload
inline: adding item field multidisc
inline: adding item field vinylorcassette
inline: adding item field vinyl
inline: adding item field cassette
inline: adding item field bootleg
inline: adding item field ep
inline: adding item field single
inline: adding item field live
library database: /run/media/matt/3TB-Drive-01/Audio/.musiclibrary.db
library directory: /run/media/matt/3TB-Drive-01/Audio
Sending event: library_opened
Sending event: import_begin
Extracting archive: /run/media/matt/3TB-Drive-01/MonsterCat/music/Various Artists/Monstercat Uncaged Vol. 5 (FLAC).zip
Archive extracted to: /tmp/tmpbr_xb73t
Sending event: import_task_created
Sending event: import_task_start
Looking up: /tmp/tmpbr_xb73t
Tagging Bossfight - Monstercat Uncaged Vol. 5
No album ID found.
Search terms: Bossfight - Monstercat Uncaged Vol. 5
Album might be VA: True
Searching for MusicBrainz releases with: {'release': 'monstercat uncaged vol. 5', 'artist': 'bossfight', 'tracks': '41'}
Requesting MusicBrainz release a832b001-af53-4211-be40-8b7674718e58
primary MB release type: album
secondary MB release type(s): compilation
Sending event: albuminfo_received
Candidate: Various Artists - Monstercat Uncaged, Vol. 5 (a832b001-af53-4211-be40-8b7674718e58)
Computing track assignment...
...done.
Success. Distance: 0.01
Requesting MusicBrainz release 6dd5c758-e441-4330-879b-a09f134df1e9
primary MB release type: album
secondary MB release type(s): compilation
Sending event: albuminfo_received
Candidate: Various Artists - Monstercat Uncaged, Vol. 5 (6dd5c758-e441-4330-879b-a09f134df1e9)
Computing track assignment...
...done.
Success. Distance: 0.03
Requesting MusicBrainz release dd8466c2-2151-4ca5-a42c-71ce8048811e
primary MB release type: album
secondary MB release type(s): compilation
Sending event: albuminfo_received
Candidate: Various Artists - Monstercat Uncaged, Vol. 7 (dd8466c2-2151-4ca5-a42c-71ce8048811e)
Computing track assignment...
...done.
Success. Distance: 0.59
Requesting MusicBrainz release 39de5cd2-07ba-41cc-baa8-a0b93ff31fe8
primary MB release type: album
secondary MB release type(s): compilation
Sending event: albuminfo_received
Candidate: Various Artists - Monstercat Uncaged, Vol. 8 (39de5cd2-07ba-41cc-baa8-a0b93ff31fe8)
Computing track assignment...
...done.
Success. Distance: 0.57
Requesting MusicBrainz release 5df838ec-da71-4324-99ae-3490fe153a5f
primary MB release type: album
secondary MB release type(s): compilation
Sending event: albuminfo_received
Candidate: Various Artists - Monstercat Uncaged, Vol. 6 (5df838ec-da71-4324-99ae-3490fe153a5f)
Computing track assignment...
...done.
Success. Distance: 0.58
Searching for MusicBrainz releases with: {'release': 'monstercat uncaged vol. 5', 'arid': '89ad4ac3-39f7-470e-963a-56509c546377', 'tracks': '41'}
Requesting MusicBrainz release a832b001-af53-4211-be40-8b7674718e58
primary MB release type: album
secondary MB release type(s): compilation
Sending event: albuminfo_received
Candidate: Various Artists - Monstercat Uncaged, Vol. 5 (a832b001-af53-4211-be40-8b7674718e58)
Duplicate.
Requesting MusicBrainz release 6dd5c758-e441-4330-879b-a09f134df1e9
primary MB release type: album
secondary MB release type(s): compilation
Sending event: albuminfo_received
Candidate: Various Artists - Monstercat Uncaged, Vol. 5 (6dd5c758-e441-4330-879b-a09f134df1e9)
Duplicate.
Requesting MusicBrainz release dd8466c2-2151-4ca5-a42c-71ce8048811e
primary MB release type: album
secondary MB release type(s): compilation
Sending event: albuminfo_received
Candidate: Various Artists - Monstercat Uncaged, Vol. 7 (dd8466c2-2151-4ca5-a42c-71ce8048811e)
Duplicate.
Requesting MusicBrainz release 39de5cd2-07ba-41cc-baa8-a0b93ff31fe8
primary MB release type: album
secondary MB release type(s): compilation
Sending event: albuminfo_received
Candidate: Various Artists - Monstercat Uncaged, Vol. 8 (39de5cd2-07ba-41cc-baa8-a0b93ff31fe8)
Duplicate.
Requesting MusicBrainz release 5df838ec-da71-4324-99ae-3490fe153a5f
primary MB release type: album
secondary MB release type(s): compilation
Sending event: albuminfo_received
Candidate: Various Artists - Monstercat Uncaged, Vol. 6 (5df838ec-da71-4324-99ae-3490fe153a5f)
Duplicate.
Evaluating 5 candidates.
/tmp/tmpbr_xb73t (41 items)
Sending event: before_choose_candidate
Correcting tags from:
Monstercat Uncaged Vol. 5
To:
Monstercat Uncaged, Vol. 5
URL:
https://musicbrainz.org/release/a832b001-af53-4211-be40-8b7674718e58
(Similarity: 98.9%) (media, tracks) (Digital Media, 2018, XW, Monstercat, MCUV5)
* What's Going Down -> What’s Going Down
* New Age (feat. Celldweller) -> New Age (title, artist)
* This Feeling (feat. Kalibwoy) -> This Feeling (title, artist)
* She's A Killer -> She’s a Killer
* Monster (feat. Panther) -> Monster (title, artist)
* In The Night (feat. Sullivan King) -> In the Night (title, artist)
* Knockin' -> Knockin’
* Uncaged Vol. 5 (Album Mix) -> Uncaged, Vol. 5 (album mix)
[A]pply, More candidates, Skip, Use as-is, as Tracks, Group albums,
Enter search, enter Id, aBort, eDit, edit Candidates, plaY? S
Sending event: import_task_choice
Removing extracted directory: /tmp/tmpbr_xb73t
Sending event: import
Sending event: cli_exit
Got it; thanks! Again, it's a surprisingly difficult problem… the "abort" pathway is supposed to shut everything down as quickly as possible, so it skips basically everything in the entire, complex import pipeline. We don't have much of a way to do "cleanup" tasks when the pipeline is aborted in this way. Any bright insights here would be appreciated…
I thought abort was more of a "let's stop the import here" (I've used it in the past because skipping stuff would mark it as done in state.pickle), whereas I use skip to pass over problematic imports while still continuing with an import queue. If a user really needs to abort something then and there, they would usually use CTRL+C to terminate the process as fast as possible.
So my question to you is: what was your intended use case for abort, and does it mesh with the use case others have for it? Would anyone really complain if we spent a few seconds removing files from a directory we temporarily created during import?
Right, that would indeed be nice—to be clear, I would very much be in favor of the abort doing something sensible and removing stuff from /tmp! That would be the right thing to do. It's just surprisingly difficult to implement given our parallel pipeline architecture—immediately ending each coroutine after the current invocation is hard enough already, and selectively propagating a specific subset of currently-in-flight tasks through the pipeline stages would be quite tricky…
Is this still relevant? If so, what is blocking it? Is there anything you can do to help move it forward? This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
It's still an issue. I don't think automatically closing an issue is an appropriate choice; just because it is a few months old doesn't mean the issue magically stops existing. I'd prefer it be kept open until it is at least documented in the documentation.
I agree that this shouldn't be closed. I think that happened because the issue still had the needinfo label. The latter probably wasn't fully justified anymore, as there are at least some actionable problems here. Some ideas:
Leaking tmp storage on abort could be resolved by registering atexit handlers when unzipping. Handling scarce memory during the import is certainly more difficult, as argued above by @sampsyo:
The unzipping stage would need to periodically poll to wait for space to free up.
That might just be a necessary evil in this case. It's indeed not very nice, since it also spreads out all the user interaction, but preferable to just crashing the import.
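To make the polling idea concrete, here is a minimal sketch built on the standard library's `shutil.disk_usage`. The function name `wait_for_free_space` and its parameters are hypothetical, and how beets would hook this into its unzipping stage is left open.

```python
import shutil
import tempfile
import time


def wait_for_free_space(needed_bytes, path=None, poll_interval=5.0, timeout=None):
    """Block until `path` has at least `needed_bytes` free, or time out.

    Returns True once enough space is available and False on timeout.
    `path` defaults to the system temp directory.
    """
    path = path or tempfile.gettempdir()
    deadline = None if timeout is None else time.monotonic() + timeout
    while True:
        if shutil.disk_usage(path).free >= needed_bytes:
            return True
        if deadline is not None and time.monotonic() >= deadline:
            return False
        # Sleep between checks so the stage does not busy-wait.
        time.sleep(poll_interval)
```

The timeout is there because, without one, a single archive larger than /tmp would make this wait forever.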
It's hard (impossible?) to predict how big the expanded files will be before expanding them.
If we track the amount of tmp storage used for unzipping, and if there's a way to catch the out-of-memory errors, we could just determine whether a retry has a chance of success (and if the second attempt also runs OOM, skip the task). In a perfect world, it would be possible to resume unzipping instead of starting over, but I've no idea whether the unzipping methods we support allow this easily (I'd be surprised TBH).
There is a possibility of deadlock if /tmp fails to have enough space even for a single album.
As above, if we track our tmp usage, and it's zero, we just quit.
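The retry, skip, or quit decision described in the replies above might be sketched as follows. Everything here is hypothetical: `TmpUsageTracker` and `action_after_failure` are illustrative names, and the sketch assumes a full /tmp surfaces as an `OSError` with `errno.ENOSPC`, which may not hold for every extraction backend.

```python
import errno
import threading


class TmpUsageTracker:
    """Track how many bytes of extracted archives are currently in flight."""

    def __init__(self):
        self._lock = threading.Lock()
        self._in_flight = 0

    def add(self, nbytes):
        with self._lock:
            self._in_flight += nbytes

    def release(self, nbytes):
        with self._lock:
            self._in_flight = max(0, self._in_flight - nbytes)

    @property
    def in_flight(self):
        with self._lock:
            return self._in_flight


def action_after_failure(exc, tracker, already_retried):
    """Decide what to do after an extraction failed with `exc`."""
    if not (isinstance(exc, OSError) and exc.errno == errno.ENOSPC):
        return "raise"  # not a space problem, so do not mask it
    if tracker.in_flight == 0:
        return "quit"   # nothing of ours to free: waiting cannot help
    if already_retried:
        return "skip"   # second failure: give up on this task only
    return "retry"      # space may free up once earlier tasks finish
```

The "quit" branch is what avoids the deadlock case: if none of the used space is ours, no amount of waiting will make room.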
Anybody have any bright ideas? I suppose we could look into "direct" importing from zip files, as @jackwilsdon mentioned and I mentioned long ago in #690 (comment), which would be awesome but difficult to implement.
That'd be the ideal solution I guess, but having some safeguards for compression formats where we can't do this would still be nice. FUSE mounting of archives is unfortunately also not really well-developed, and would require root...
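For zip archives specifically, the standard library can already hand out file-like objects for individual members without extracting anything to disk. The sketch below only shows that mechanism. `iter_audio_members` and the extension list are invented for illustration, and whether the rest of the import pipeline could work from such objects is exactly the open question in this thread.

```python
import zipfile

# Illustrative only; not the set of formats beets actually supports.
AUDIO_EXTENSIONS = (".flac", ".mp3", ".ogg", ".m4a")


def iter_audio_members(archive_path):
    """Yield (member name, binary file object) for audio files in a zip.

    Nothing is written to disk. Each file object is only valid until the
    caller asks for the next item, so read what you need before advancing.
    """
    with zipfile.ZipFile(archive_path) as zf:
        for info in zf.infolist():
            if info.is_dir():
                continue
            if info.filename.lower().endswith(AUDIO_EXTENSIONS):
                with zf.open(info) as member:
                    yield info.filename, member
```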
I'll assign this to myself, since I think that at least the atexit handlers should be implemented. I won't tackle this immediately, so if someone has the time, please go ahead.
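A minimal sketch of the atexit approach, assuming a plain zip archive and using only the standard library. The function `extract_with_cleanup` is hypothetical and is not how beets currently extracts archives.

```python
import atexit
import shutil
import tempfile
import zipfile


def extract_with_cleanup(archive_path):
    """Extract a zip into a fresh temp dir and schedule its removal at exit."""
    extract_to = tempfile.mkdtemp()
    # Register before extracting so a partial extraction is cleaned up too.
    # ignore_errors=True makes the handler harmless if the directory was
    # already removed by the normal import path.
    atexit.register(shutil.rmtree, extract_to, ignore_errors=True)
    with zipfile.ZipFile(archive_path) as zf:
        zf.extractall(extract_to)
    return extract_to
```

Note that atexit handlers run on normal interpreter shutdown, but not when the process is killed by a signal that Python does not handle or when `os._exit()` is called, so this would not cover every way an import can end.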
Thanks, @wisp3rwind. Indeed, the thing to do here is to figure out what concrete actions to take. Here are the options I see above, to summarize:
/tmp and stall the import to wait for more free space. (Not an awesome solution, but it would kind of work most of the time. And would also be pretty complicated to implement, perhaps even more complicated than the "direct" solution above, given the need to rely on polling and estimates and other inconveniences.)
(atexit is one way to do it, but even better would be to extend our pipeline module to allow stuff to run when the pipeline itself exits, without requiring the entire Python process to quit before we clean up.)
Given the substantial downsides of 2 relative to 1, I think it is probably not worth it IMO.
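One way to picture cleanup that is tied to the pipeline run rather than to process exit is `contextlib.ExitStack`. This is only a sketch under heavy assumptions: `run_import`, `archives`, and `process` are hypothetical stand-ins, and the real beets pipeline is multi-threaded and coroutine-based, which this sequential loop does not model.

```python
import contextlib
import shutil
import tempfile


def run_import(archives, process):
    """Call `process(archive, tmpdir)` for each archive, then remove all
    temp dirs when the run ends, however it ends.

    `archives` and `process` are stand-ins for the real pipeline stages.
    """
    with contextlib.ExitStack() as cleanup:
        for archive in archives:
            tmpdir = tempfile.mkdtemp()
            # Each callback runs when the `with` block is left, whether the
            # loop finished normally or an exception (such as an abort)
            # propagated out of it.
            cleanup.callback(shutil.rmtree, tmpdir, ignore_errors=True)
            process(archive, tmpdir)
```

The difference from atexit is scope: the directories are gone as soon as the run finishes, even if the Python process keeps going.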
Doing beet import * on a large-ish collection (approx. 150 gigs of zipped music) causes the software to quickly fail as it has filled up all available /tmp storage.
Problem
Original run where I encountered the error:
Re-run with -vv as per GitHub issue template:
Led to this problem (once a few files have been extracted and filled up /tmp):
Setup
My configuration (output of beet config) is:
Edit: Forgot to add some screenshots.