Closed RollingStar closed 3 years ago
Hi! Thanks for the details on this.
Can you tell whether beets is stuck communicating with MusicBrainz to get matches or doing the matching? I've found in the past that the matching algorithm itself can be the culprit in these situations with very large albums. Specifically, it seems from the log like it's getting stuck when it tries to match with a very large candidate album.
This might be testable, for example, by giving beets an MBID to search for and comparing the time to match a few different options in isolation.
I gave beets an MBID for the correct match (volume 6 of the set rather than the entire set) and the whole operation was quite manageable: probably under 5 minutes, compared to the 1h45+ run which I eventually cancelled.
How specifically do you propose I test the performance? I think this is the MBID that beets was hanging on, but I'm not sure. 03ce3f61-9294-4f5d-aace-a4687aad4229
edit: and keep in mind, my point for this issue is that no matter how optimized beets gets, there should be a time limit, because it is a more graceful handling of unpredictable edge cases.
Thanks for looking into it. And also for including the "dummy" file above, which does match the same albums.
Oddly, on my machine (an aging MacBook Pro), this only takes a few seconds per match. It's not exactly snappy, but it's also not in the tens of minutes. It seems like about half the time is being spent in track matching (i.e., on the CPU) and the other half is just downloading data from MusicBrainz.
So, let's try to narrow down what's taking so long on your machine. I just pushed a bunch of extra logging that should delimit when beets is trying to match metadata and when it's communicating with MB. It will be helpful to sort that out regardless of whether we add a timeout or just optimize that chunk of code so it can't take so long. Can you give the most recent source a try in verbose mode and report back about which phase takes the longest?
Hi! I'm having a similar problem with this release: https://musicbrainz.org/release/9bcd75dd-995e-482b-8ba7-1ef074d253de

However, with very big releases (like a complete Bach edition) I don't have problems when I have only a fraction of the files. Complete releases with ~15 CDs, like https://musicbrainz.org/release/b0326242-344d-4a07-a175-c36a1418b644, are handled without problems, even if all the files are there.

Also, when some files are duplicated and some are missing, beets tries to put the duplicates in place of the missing files. Maybe there should be a check for whether a file is actually a duplicate. I would guess that beets tries every configuration of files to see which one fits best; if you have 700 or 1000 files to combine, that makes a lot of combinations to try. If it isn't done already, it would make sense to check a tag like 'tracknumber' to reduce the number of possible combinations.
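The tracknumber idea above can be sketched in a few lines. This is a hypothetical illustration, not beets' actual matching code: the dict-based file and track records and the function name are assumptions. The point is that unambiguous (file, track) pairs can be fixed up front, leaving only the leftovers for the expensive distance-based assignment.

```python
def prune_by_tracknumber(files, tracks):
    """Pair each file with a candidate track when its 'tracknumber' tag
    matches exactly one unused track; return the fixed pairs plus the
    leftover files and tracks that still need full distance-based matching.

    files:  list of dicts, optionally carrying a 'tracknumber' key.
    tracks: list of dicts with a 'tracknumber' key.
    """
    # Index candidate tracks by track number.
    by_num = {}
    for t in tracks:
        by_num.setdefault(t.get("tracknumber"), []).append(t)

    pairs, leftover_files = [], []
    used = set()  # id()s of tracks already claimed by a file
    for f in files:
        cands = [t for t in by_num.get(f.get("tracknumber"), [])
                 if id(t) not in used]
        if len(cands) == 1:
            # Exactly one plausible slot: pin it without scoring anything.
            pairs.append((f, cands[0]))
            used.add(id(cands[0]))
        else:
            # Untagged, ambiguous, or duplicate file: defer to full matching.
            leftover_files.append(f)

    leftover_tracks = [t for t in tracks if id(t) not in used]
    return pairs, leftover_files, leftover_tracks
```

Note that a second file claiming an already-used track number falls through to the leftovers, which also addresses the duplicate-file concern raised above.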
The timeout still isn't implemented; I had beets searching for a few hours trying to match an album.
If the problem is the handling and comparison of too many candidates, could there be a way to give the MusicBrainz ID as an input to sidestep the problem? Or set a timeout and, once it expires, prompt for an ID.
It's actually the time taken to evaluate a single candidate album, when that candidate itself is very large.
Still, if we already tell it "this is the right release, now you only have to get the tracks in the right place," could that speed things up, or is it exactly that step that is taking the time?
Yes, disabling the track alignment procedure could work.
I still think a time limit (perhaps with a user prompt to keep looking) would help retain user-friendliness for unpredictable edge cases. This could be done as the only fix, or in addition to any other fixes.
I have the same problem with a multi-disc release. I ended up importing all the CDs individually, but now beets considers that it has the same release several times, one per import. Is there a way to merge them, so that there is only one release containing all the CDs?
The most relevant issue about merging albums is https://github.com/beetbox/beets/issues/112.
Currently dealing with this issue for d51f64e5-1d52-4d40-8308-5c9f22c4af03.
I'm having the same(?) issue with these releases: https://musicbrainz.org/series/aa610319-3b4a-4a26-a127-b5dc78c3b732 Beets just seems to be doing something forever. Any bigger release (e.g. dozens of discs, or hundreds of tracks) seems to be impossible to import using beets. Another such example would be this: https://musicbrainz.org/release-group/9d0636f7-e959-4a00-9787-9a23c07f9655
We're still dealing with this issue; an album of 260 tracks took over an hour before beets finally realized that there was no match. It would be very nice to have at least a timeout to keep this from happening needlessly.
Is this still relevant? If so, what is blocking it? Is there anything you can do to help move it forward? This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
The "what is blocking it" line works better on a huge project than this one. There aren't many devs, and even fewer devs know beets in detail.
The purpose of that line, in the case of these issues, is to identify what is preventing the issue from being labeled as a feature or a bug, and to provide the necessary info to "upgrade" it from needinfo status. So in the case of this issue, perhaps one blocking element is the request from @sampsyo:
So, let's try to narrow down what's taking so long on your machine. I just pushed a bunch of extra logging that should delimit when beets is trying to match metadata and when it's communicating with MB. It will be helpful to sort that out regardless of whether we add a timeout or just optimize that chunk of code so it can't take so long. Can you give the most recent source a try in verbose mode and report back about which phase takes the longest?
@RollingStar were you ever able to get around to this?
Is this still relevant? If so, what is blocking it? Is there anything you can do to help move it forward?
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
I'm guessing this is still relevant as it wasn't ever fixed as far as I know.
Part of why I haven't gone back and tried to reproduce this specific bug with the massive Mozart box set is that doing so would only plug one potential bug, whereas enforcing a time-out would stop a whole class of odd behavior. Another reason is that I eventually caved and imported the box set as sub-sets.
Would anybody who's affected by this issue be interested in doing the aforementioned digging to figure out the root cause (cf https://github.com/beetbox/beets/issues/2446#issuecomment-280982403, for example)?
Hi! I have a few releases that fall in the category 'too big to handle', but I'm a bit occupied right now.
Problem
If a candidate is a huge box set, beets slows to an absolute crawl trying to match tracks to it. There should be a cap on maximum lookup time, and beets should cancel a lookup thread that runs past that limit. It can then present the user with its candidates, including the interrupted lookups, and give its usual prompts, like (A)pply, plus a new one: something like (C)ontinue interrupted lookups. The time limit could default to something reasonable (10-60 seconds) and perhaps be user-configurable in the config file.
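A minimal sketch of the proposed cap, assuming a hypothetical `match_fn` that scores a single candidate (this is not beets' code). One caveat baked into the comments: Python cannot forcibly kill a running thread, so a timed-out worker keeps running in the background; a real implementation would need cooperative cancellation or a subprocess.

```python
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout

def match_with_timeout(match_fn, candidates, timeout=30):
    """Evaluate each candidate with match_fn, giving up on any that
    exceeds `timeout` seconds. Returns (finished, interrupted), where
    finished is a list of (candidate, result) pairs and interrupted is
    the list of candidates that could later be (C)ontinued.
    """
    finished, interrupted = [], []
    with ThreadPoolExecutor(max_workers=1) as pool:
        for cand in candidates:
            future = pool.submit(match_fn, cand)
            try:
                finished.append((cand, future.result(timeout=timeout)))
            except FutureTimeout:
                # NOTE: the worker thread is still running; Python offers
                # no way to kill it. We merely stop waiting for it.
                interrupted.append(cand)
    return finished, interrupted
```

With this shape, the importer could show `finished` as its usual candidate list and offer to resume the `interrupted` ones on request.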
The import hangs (gets really slow but eventually finishes) on two attempts for the same files. Similar lookups took at least 25 minutes, though I haven't timed them exactly. Edit: I started this command at least 1 hour and 45 minutes ago, before posting this issue. Beets is still considering the entire box set as a candidate, and Python still shows CPU usage.
Running this command in verbose (`-vv`) mode led to this problem:
Here's a link to the music files that trigger the bug (if relevant): n/a, but this dummy file that matches the box set probably simulates the same effect. dummy - 01 - Mozart, Wolfgang Amadeus - 7 Menuette, KV 61b (65a) - 1. Menuett in G-dur.zip
Setup
My configuration (output of `beet config`) is: