Open willhughes-au opened 3 months ago
Hi, thanks for the issue.
The metadata must be downloaded first; media items cannot be downloaded without it. The metadata lists the video and audio streams that are available, subtitle information, and so on.
You could potentially re-order this to download metadata first, then media, then thumbnails, but really this shouldn't be a big issue. The only way tubesync will ever take a long time processing thumbnails is as a one-off when you initially add a very large channel or playlist. Once it's added and the initial sync has completed, only newly added media will be indexed.
Re-ordering the priorities would result in the very minor change of media downloading before thumbnails. Thumbnails are quick to download, so the trade-off is that if you added a massive channel, your tubesync instance could spend a very long time downloading media while the interface shows no thumbnails. That likely isn't worth it.
Well, this is always happening to me, and TubeSync takes forever to download videos. I've set all my sources to 7 days; otherwise it wouldn't download anything and would just keep scanning.
The way TubeSync works needs a rethink because it's not viable. It's not great to have no indication of progress and just keep your fingers crossed that it works this time. You wait a whole week and then find that nothing really worked.
Honestly @gravelfreeman that sounds like something else is wrong. What's in your tasks tab? Once the initial metadata has been downloaded only the metadata and thumbnails of new media are downloaded. Unless you're adding massive channels every few days what you're experiencing shouldn't be happening.
What specifically is taking days to complete? The video downloads themselves?
It has downloaded 2800ish out of 3000 videos of a big channel. Then it stopped working. All other channels are stuck at 0. Right now I can't update TubeSync because I had my apps on TrueNAS, whose catalog doesn't exist anymore. I'm soon moving my server to a better OS and will run Kubernetes in a VM, so I'll finally be able to update again. I believe the TS version I'm currently on is broken.
If you're running an outdated version of tubesync that's almost certainly contributing to the issue. We have to track upstream releases of yt-dlp very closely, because yt-dlp does the actual downloading and is regularly updated to work around limits placed by YouTube. It's likely, then, that you are being throttled, blocked or otherwise limited.
To attempt to limit blocking, tubesync now only runs a single download at a time, so if a download hangs forever it would indeed block all other downloads from working.
Stop the container, update to the latest version and see if that helps.
Thanks for the reply. I was able to update and after clearing all tasks and leaving it for a few hours I got over 300 of these errors:
Error: "Failed to download media: 5PInY4qavRk (UUID: db19111d-5625-46ee-81b1-2fec42d46ba7) to disk, expected outfile does not exist: /downloads/video/archivesrc/2024-03-13_les-archives-de-radio-canada_la-situation-des-afros-canadiens-en-1966_5PInY4qavRk_720p-vp09-mp4a.mkv"
What's weird is that TubeSync can write to the folder /downloads/video/ because I can see the other files like:
2020-03-23_les-archives-de-radio-canada_le-8-mai-1982-gilles-villeneuve-meurt-lors-des-qualifications-du-grand-prix-de-b_LPaZCNI_BDk_1080p-avc1-mp4a.nfo
2020-03-23_les-archives-de-radio-canada_le-8-mai-1982-gilles-villeneuve-meurt-lors-des-qualifications-du-grand-prix-de-b_LPaZCNI_BDk_1080p-avc1-mp4a.jpg
2020-03-23_les-archives-de-radio-canada_le-8-mai-1982-gilles-villeneuve-meurt-lors-des-qualifications-du-grand-prix-de-b_LPaZCNI_BDk_1080p-avc1-mp4a.info.json
It's only missing the video files.
There should be another error in the logs above "expected outfile does not exist". That error basically just means "yt-dlp was called, and when it finished the expected video file didn't exist on disk", so the cause can be anything really: YouTube blocked your IP, you've run out of disk space, etc.
The earlier error should give a better idea of what the actual problem is.
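If it helps to rule out the local causes first, here is a rough sketch of the kind of check you could run by hand against the path from the error message. It is not part of tubesync; the function name `diagnose_missing_outfile` and the 1 GiB free-space threshold are made up for illustration. It only looks at things visible on disk (missing directory, low free space, leftover `.part` files from an interrupted yt-dlp download) and cannot tell you whether YouTube is blocking you.

```python
import shutil
from pathlib import Path


def diagnose_missing_outfile(outfile, min_free_bytes=1024**3):
    """Return a list of hints about why an expected output file is missing.

    Hypothetical helper for manual debugging, not tubesync code.
    """
    path = Path(outfile)
    if path.exists():
        return []  # nothing to diagnose, the file is there

    parent = path.parent
    if not parent.is_dir():
        return [f"directory does not exist: {parent}"]

    hints = []

    # Low free space on the download volume is a common local cause.
    free = shutil.disk_usage(parent).free
    if free < min_free_bytes:
        hints.append(f"low disk space: {free} bytes free in {parent}")

    # Leftover .part files suggest a download that started but never finished.
    partials = sorted(
        p.name
        for p in parent.iterdir()
        if p.name.startswith(path.stem) and p.name.endswith(".part")
    )
    if partials:
        hints.append("partial download files found: " + ", ".join(partials))

    if not hints:
        hints.append("no obvious local cause; look for an earlier yt-dlp error in the logs")
    return hints
```

Calling it with the full path from the "expected outfile does not exist" message returns an empty list if the file is actually there, otherwise a list of hints to follow up on.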
Perhaps I'm using it wrong, but TubeSync seems to perform poorly when you add more than a handful of playlists/channels to it. I believe that, at least in part, this is caused by the queue priorities not being optimal.
Currently it seems that actually downloading media is the last thing that will happen, so the task queue grows massively with the various tasks that need to be completed.
What should happen, IMO, is that the queue priorities are inverted, or at least tweaked. Downloading media that's been discovered should be the highest priority, followed by thumbnails and metadata, with the discovery tasks being lowest priority.
Having the various task priorities read from env vars would do the trick, if you don't want to make this a global change for everyone.
Would you accept a PR to make these priorities read from env vars?
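To make the proposal concrete, this is roughly what I have in mind. It is only a sketch: the task names, the `TUBESYNC_PRIORITY_` prefix and the default numbers are all placeholders I made up, not TubeSync's actual task names or values, and I'm assuming a lower number means the task runs earlier.

```python
import os

# Hypothetical task names and defaults (lower number = runs first).
# These are placeholders, not TubeSync's real values.
DEFAULT_PRIORITIES = {
    "download_media": 10,
    "download_thumbnail": 20,
    "download_metadata": 30,
    "index_source": 40,
}


def load_priorities(env=None):
    """Build the priority map, letting env vars override the defaults.

    Each task is looked up as TUBESYNC_PRIORITY_<TASK_NAME>; values that
    are missing or not integers fall back to the default.
    """
    env = os.environ if env is None else env
    priorities = {}
    for task, default in DEFAULT_PRIORITIES.items():
        raw = env.get(f"TUBESYNC_PRIORITY_{task.upper()}")
        try:
            priorities[task] = int(raw) if raw is not None else default
        except ValueError:
            # Ignore malformed values rather than failing at startup.
            priorities[task] = default
    return priorities
```

With nothing set in the environment this returns the defaults unchanged, so existing installs would behave as before; setting, say, `TUBESYNC_PRIORITY_INDEX_SOURCE=5` would change only that one task.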