Marekkon5 / onetagger

Music tagger for Windows, MacOS and Linux with Beatport, Discogs, Musicbrainz, Spotify, Traxsource and many other platforms support.
https://onetagger.github.io/
GNU General Public License v3.0

Maximum Capability / File Count? #330

Open davidmacfadyen opened 9 months ago

davidmacfadyen commented 9 months ago

Dear OneTagger, I've been running the program, which is impressive, on large data sets of close to 100,000 files. Sadly, the program taps out midway and cannot proceed beyond the halfway point. It's strange, though, that exactly 50% is the furthest point, beyond which things freeze. I have reduced the folder size to 60K files, but still have the same issue. What's the maximum number of MP3s that the program can handle? If you know that number from your own work, it would save me a lot of time guessing! Thanks again for a very promising tool. Yours, David

Marekkon5 commented 9 months ago

1T should be able to handle large numbers of files, but it is not recommended. You should run it on smaller libraries / chunks and check that the tags come out without problems.

Anyway, the possible reasons why it froze are:

Can you please send us the log? Settings > Advanced > Open data dir > onetagger.log. However we strongly recommend running on smaller libraries because it is hard to pinpoint what caused this with 100K tracks.

Also, we recommend trying the latest development build from the Actions tab, since that has some possible fixes. Thanks

davidmacfadyen commented 9 months ago

Sadly because the program freezes, I have to force-shut (Mac, Sonoma) and then access the log. But I see it's over 700MB(!)

davidmacfadyen commented 9 months ago

https://www.dropbox.com/scl/fi/iddfsky5ycxtpfqliiz5w/onetagger.log?rlkey=fnmhzd4o745wshchk32t8f6n4&dl=0 Oh, and I am using the latest build from your site, downloaded less than a month ago. Your help is very much welcome here. And I would love a concrete/recommended task size, in terms of audio files!

Marekkon5 commented 9 months ago

Hello, thank you for the log file, although I probably won't be able to extract much info from a 700MB file, which is extreme. I highly recommend deleting the log file and trying smaller chunks / parts of the library to see if you can reproduce the error. As for how big the chunks should be: I recommend first trying just a few songs to see the results, and only later trying e.g. 1K or 5K chunks. Also remember to make backups.
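The chunking advice above could be planned with a small script. This is only a minimal sketch, not part of 1T: the extension list, the function names and the `chunk_0001.txt` naming are all assumptions made for illustration. It walks a library folder and writes plain-text lists of at most `size` audio files each, so each batch can be tagged (and backed up) separately.

```python
from itertools import islice
from pathlib import Path
from typing import Iterable, Iterator, List

# Assumption: which extensions count as audio files in this library.
AUDIO_EXTENSIONS = {".mp3", ".flac", ".m4a", ".aiff", ".wav", ".ogg"}


def iter_audio_files(root: Path) -> Iterator[Path]:
    """Yield audio files under root in sorted (predictable) order."""
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.suffix.lower() in AUDIO_EXTENSIONS:
            yield path


def chunked(items: Iterable, size: int) -> Iterator[List]:
    """Yield consecutive lists of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def write_chunk_lists(root: Path, out_dir: Path, size: int = 1000) -> List[Path]:
    """Write one text file per chunk, each listing up to `size` file paths."""
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for index, batch in enumerate(chunked(iter_audio_files(root), size), start=1):
        list_file = out_dir / f"chunk_{index:04d}.txt"
        list_file.write_text(
            "\n".join(str(path) for path in batch) + "\n", encoding="utf-8"
        )
        written.append(list_file)
    return written
```

The list files are only a way to plan batches of 1K or 5K; copying or moving each batch into its own folder before tagging is left to the reader.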

As for the version: only the full release versions are available on the website. To get the latest "development" version, you have to download it from the Actions tab on GitHub.

Thanks

Marekkon5 commented 9 months ago

The latest commit should have potential fixes for the issues from the log. You can try it by downloading the binary from the Actions tab.

However I still recommend trying on smaller portions and doing backups.

SimonDedman commented 8 months ago

Hi folks, onetagger paused/hung on 2000 successful matches, 512 fails, of 43162 files. I'll try the new dev version now, cheers. Edit: tried yesterday's version on a subfolder (197 files): 5 mins in, no matches, no fails, nothing happening, can't cancel it, had to force kill. Now same deal with default v1.7.

I tried a different subfolder and that processed a few files (all of 1 subsubfolder) then stopped. However because onetagger doesn't process folders/files in order (e.g. from my alphabetical folder list it started at M), there's no way for me to deduce where it's failing.

I don't have Settings > Advanced > Open data dir > onetagger.log; closest I have is Settings > Preferences > Advanced, but it only has 2 client side options, nothing about logging.

davidmacfadyen commented 8 months ago

That's my issue, for sure. After a few mins the operation freezes entirely, without any ability to edit/pause. Killing the program is the only option. I am down to folders of <10K files, and it's still happening. Random guess: it's not a language-related matter, is it? I'd say a good 50% of my files are not in English (though the file/path name may be, of course). The time clock never runs: that may be irrelevant too.

SimonDedman commented 8 months ago

I suspect my "0-9 and symbols" folder might be crashing on band "!!!" as well... could be a character thing

davidmacfadyen commented 8 months ago

Because I have so much that's in Cyrillic, i.e. very obscure, I always have more fails than matches. That's one more piece of contextual info. I am hoping to use OneTagger for that reason, however. Discogs is the most useful source for me, and OneTagger's generic/stylistic fingerprint is especially useful for tracks that are not in English. It allows people to make sense of, group, investigate etc. tracks they cannot understand. Fingers crossed we find a solution. I used to use Beatunes (from Germany), but that program, despite the great metadata it can embed in files, is very, very, very slow indeed.

SimonDedman commented 8 months ago

Sounds like you have a lot of regex in your future!

Marekkon5 commented 8 months ago

Hello, could you please send the log file, so we can debug, thank you.

davidmacfadyen commented 8 months ago

I am running a new search right now and will then send the file, thanks! D


davidmacfadyen commented 8 months ago

BTW, does the program accrue some kind of "cache" after many long searches that, ideally, should be deleted?

Marekkon5 commented 8 months ago

1T doesn't cache anything. The only thing that would make sense to delete is the log file, so that it doesn't grow to MB sizes and it's easier to find relevant info later, but that's not necessary.
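If the log has already grown very large, it can still be skimmed without opening the whole file. The following is a rough sketch under assumptions: the keywords are guesses, not the actual wording of onetagger.log, and `scan_log` is a made-up helper name. It streams the file line by line and keeps only the most recent matching lines plus the last few lines overall.

```python
from collections import deque
from pathlib import Path
from typing import Deque, Iterable, List, Tuple


def scan_log(
    log_path: Path,
    keywords: Iterable[str],
    tail: int = 200,
    max_matches: int = 1000,
) -> Tuple[List[str], List[str]]:
    """Stream a large log once; return (recent matching lines, last lines).

    Memory use stays bounded because both collections are capped deques.
    """
    needles = [keyword.lower() for keyword in keywords]
    matches: Deque[str] = deque(maxlen=max_matches)
    last_lines: Deque[str] = deque(maxlen=tail)
    # errors="replace" keeps going even if the log contains odd bytes.
    with log_path.open("r", encoding="utf-8", errors="replace") as handle:
        for raw_line in handle:
            line = raw_line.rstrip("\n")
            last_lines.append(line)
            lowered = line.lower()
            if any(needle in lowered for needle in needles):
                matches.append(line)
    return list(matches), list(last_lines)
```

A call such as `scan_log(Path("onetagger.log"), ["error", "429"])` would be one way to use it; the keyword list there is only a starting guess.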

davidmacfadyen commented 8 months ago

Forgot to say here that the .log file is actually at Settings (cog icon) > General > "Open data folder" button > onetagger.log

davidmacfadyen commented 8 months ago

I am now down to folders less than 15K, since we keep freezing. I'll tell you when I reach a workable number!

SimonDedman commented 8 months ago

Thank you mate; I'll give this a go when I'm home

SimonDedman commented 8 months ago

onetagger.log

Now attached, cheers

Marekkon5 commented 8 months ago

According to the log, you're just getting rate limited by Spotify. There have been some reports of Spotify rate limits lasting as long as 24h, so I guess this is the new norm. You should try waiting and tagging smaller batches, or wait for us to add artificial delays to potentially prevent this.
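To illustrate what "artificial delays" around a rate-limited API can look like, here is a generic sketch. It is not 1T's code, and every name in it (`fetch_with_backoff`, `RateLimitedError`, the shape of the `fetch` callable) is hypothetical. It assumes the server answers with HTTP 429 and may send a `Retry-After` header in seconds. If the requested wait is longer than `max_wait`, it gives up with an error instead of appearing to hang.

```python
import time
from typing import Callable, Mapping, Optional, Tuple

# Hypothetical response shape: (status code, headers, parsed body).
Response = Tuple[int, Mapping[str, str], Optional[dict]]


class RateLimitedError(RuntimeError):
    """Raised when waiting out the rate limit is not practical."""


def fetch_with_backoff(
    fetch: Callable[[], Response],
    max_retries: int = 5,
    base_delay: float = 1.0,
    max_wait: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, Optional[dict]]:
    """Call fetch(); on HTTP 429, wait and retry a limited number of times."""
    for attempt in range(max_retries + 1):
        status, headers, body = fetch()
        if status != 429:
            return status, body
        if attempt == max_retries:
            break
        retry_after = headers.get("Retry-After")
        if retry_after is not None and retry_after.isdigit():
            delay = float(retry_after)  # the server said how long to wait
        else:
            delay = base_delay * (2 ** attempt)  # exponential backoff
        if delay > max_wait:
            # e.g. a 24h limit: report it rather than sleep silently.
            raise RateLimitedError(f"rate limited for {delay:.0f}s, giving up")
        sleep(delay)
    raise RateLimitedError(f"still rate limited after {max_retries} retries")
```

Failing loudly on very long limits is a deliberate choice here: with limits reported as high as 24h, a visible error is easier to act on than a frozen progress bar.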

SimonDedman commented 8 months ago

Cheers. I'll try turning Spotify off and see how I get on without it.

SimonDedman commented 8 months ago

Turning off Spotify worked for me. I wonder if there's scope to order the tag info providers by speed, access freedom, and database size? That way it would try the whole list first with a free, quick provider with the largest database, then the remainder with the next, then the remainder with the next, ending up on Spotify with only the remaining untagged files, i.e. a smaller total list?
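The cascade idea can be sketched as follows. This is purely illustrative and says nothing about how 1T schedules platforms internally: the provider names and the `lookup` callables are placeholders, with each lookup assumed to return a dict of tags or `None` when there is no match.

```python
from typing import Callable, Dict, Iterable, List, Optional, Sequence, Tuple

# Hypothetical lookup: takes a track identifier, returns tags or None.
Lookup = Callable[[str], Optional[dict]]


def tag_in_cascade(
    tracks: Iterable[str],
    providers: Sequence[Tuple[str, Lookup]],
) -> Tuple[Dict[str, Tuple[str, dict]], List[str]]:
    """Try providers in order; each one only sees tracks still untagged."""
    remaining = list(tracks)
    tagged: Dict[str, Tuple[str, dict]] = {}
    for name, lookup in providers:
        still_untagged = []
        for track in remaining:
            result = lookup(track)
            if result is None:
                still_untagged.append(track)
            else:
                tagged[track] = (name, result)
        remaining = still_untagged
        if not remaining:
            break  # nothing left for the slower providers
    return tagged, remaining
```

With the fast, unrestricted providers listed first, the rate-limited one at the end of `providers` receives only the leftovers, which is the smaller total list described above.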

Marekkon5 commented 8 months ago

You can order the platforms yourself by dragging and dropping them. We already included info such as speed or whether it needs auth in the platform cards, so anyone can order them however they want.

SimonDedman commented 8 months ago

Oh jeez sorry mate, totally didn't realise that. Nice work, thanks!