Skip already downloaded videos

PotcFdk / youtube-sync

Script for maintaining an up-to-date offline mirror of a YouTube channel.

Apache License 2.0

44 stars, 13 forks

Skip already downloaded videos #21

Closed (mothlyme closed this 3 years ago)

mothlyme commented 3 years ago

Let youtube-dl generate a content list to keep track of already downloaded videos. This should considerably speed up updates of larger profiles.

PotcFdk commented 3 years ago

Thanks for the PR. The current plan for the project is mostly written down here; I just have not had enough time to continue working on it. That post also mentions the use of download-archive and the goal that the archive file should eventually be autogenerated from the files that actually exist. To prevent certain videos from being downloaded, we can merge that autogenerated list with a manually edited file. So the solution in this PR is probably not going to stay around in the long term, but it may be a good intermediate solution, provided some conditions are met.

As for those conditions, one question: have you measured whether there is a considerable difference at all? From memory, I know that youtube-dl already skips videos whose output file exists. Does download-archive skip them faster?
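The plan mentioned above (autogenerate the archive from the files that exist, then merge it with a manually edited list) could be sketched roughly as follows. This is an illustration under assumptions, not code from youtube-sync: the function names are hypothetical, and it assumes video files are named `<title>-<id>.<ext>` as in youtube-dl's default output template, with archive lines of the form `youtube <id>`.

```shell
#!/usr/bin/env bash
# Sketch: rebuild a download-archive file from the videos on disk and merge
# it with a hand-maintained list of entries that should never be fetched.
# Assumption: files are named "<title>-<id>.<ext>"; all names are hypothetical.

# Print one "youtube <id>" line per matching video file in directory $1.
archive_from_files() {
    local dir="$1" file name
    # An 11-character YouTube ID preceded by "-" at the end of the base name.
    local re='-([A-Za-z0-9_-]{11})$'
    for file in "$dir"/*; do
        [ -f "$file" ] || continue
        name="${file##*/}"   # strip the directory part
        name="${name%.*}"    # strip the last extension
        if [[ $name =~ $re ]]; then
            printf 'youtube %s\n' "${BASH_REMATCH[1]}"
        fi
    done
}

# Merge the generated list (from directory $1) with the manual list file $2,
# dropping duplicate lines. A missing manual file is simply ignored.
build_archive() {
    local dir="$1" manual="$2"
    { archive_from_files "$dir"; [ -f "$manual" ] && cat "$manual"; } | sort -u
}
```

A run might then look like `build_archive ./videos skip.list > archive.txt`, with the result passed to youtube-dl via `--download-archive archive.txt`. Both file names here are placeholders.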

mothlyme commented 3 years ago

When youtube-dl downloads an entire YouTube channel or playlist, it first extracts the list of videos and then goes through them one by one. Normally it downloads each video page with all its metadata before it can check whether the output file already exists (the file name can use placeholders such as the title), and only then skips the video. This takes about a second per video. With the download-archive option, each downloaded video is added to the list file by its video ID. The check whether a video can be skipped then happens before any further metadata is fetched. This reduces the total run time for a channel (given no new videos have been uploaded since the last run) to the time needed to fetch the list of videos on that channel.

With my setup of about 500 videos, it reduces the runtime from 8 minutes to roughly 10 seconds in total when there are no new uploads.
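The lookup that the archive makes possible can be sketched in a few lines of bash. This is only an illustration of the idea, not youtube-dl's or youtube-sync's actual code: `is_archived`, `ARCHIVE` and `CHANNEL_URL` are hypothetical names, and the sketch assumes the archive holds one `youtube <id>` line per downloaded video.

```shell
#!/usr/bin/env bash
# Sketch: the cheap "already downloaded?" lookup enabled by an archive file.
# ARCHIVE and is_archived are hypothetical names used for illustration.
ARCHIVE="${ARCHIVE:-archive.txt}"

# Succeed if the given video ID is already recorded in the archive.
# Only the local file is read; no video page has to be fetched.
is_archived() {
    local id="$1"
    [ -f "$ARCHIVE" ] && grep -Fxq -- "youtube $id" "$ARCHIVE"
}

# A real run hands the same file to youtube-dl, which records each
# finished download in it and skips IDs that are already listed:
#   youtube-dl --download-archive "$ARCHIVE" "$CHANNEL_URL"
```

The whole-line match (`grep -Fx`) is deliberate, so that one ID being a prefix of another does not produce a false hit.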

PotcFdk commented 3 years ago

Alright, I guess that's a good enough speedup to merge this for the time being. Again, thanks for the contribution. This project is not dead; I just have a lot going on and lack the energy and motivation to implement all the new ideas laid out in the post.

With small changes such as this one, which seem innocent and should just work without causing much harm, the temptation to just merge is big. However, I have no idea how many people actually use this, and if something breaks I do not want to leave people relying on a broken state until I get around to fixing it. So I hope you understand my delayed response and hesitation.

I am actually second-guessing whether making this project a bash script was a good choice, or whether something else, be it node.js, would make for more maintainable code and less friction with the more complex features that I still want to see (even for my own use). Oh well, I'm derailing again. Merging.