meeb / tubesync

Syncs YouTube channels and playlists to a locally hosted media server
GNU Affero General Public License v3.0

Can tubesync choose a certain start date to synchronize a channel/playlist? #208

Open yche2990 opened 2 years ago

yche2990 commented 2 years ago

I follow quite a few gaming channels on YouTube, and it is quite difficult to import large numbers of videos into this app. Is there an option to fetch/download only videos posted on the channel/playlist from a certain date onwards (chosen by me) and to skip any older videos?

Thank you.

meeb commented 2 years ago

Does the "download cap" option at the source level not do what you want? TubeSync has to index all the media on a channel to get the media items published date, but it will only download media within the download cap time range.
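The behaviour described above (index everything, download only what falls inside the cap window) can be illustrated with a minimal sketch. This is not TubeSync's actual code: the function name `within_download_cap`, the cap expressed in seconds, and the convention that `0` means "no cap" are all assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone

def within_download_cap(published, cap_seconds, now=None):
    """Return True if `published` falls inside the cap window ending at `now`.

    Hypothetical helper; assumes cap_seconds <= 0 means "no cap".
    """
    now = now or datetime.now(timezone.utc)
    if cap_seconds <= 0:
        return True
    return published >= now - timedelta(seconds=cap_seconds)

# Every item is indexed (its published date is known), but only
# items inside the cap window are selected for download.
now = datetime(2024, 1, 31, tzinfo=timezone.utc)
indexed = [
    ("old-video", datetime(2021, 6, 1, tzinfo=timezone.utc)),
    ("recent-video", datetime(2024, 1, 20, tzinfo=timezone.utc)),
]
cap = 30 * 24 * 60 * 60  # a 30-day cap, in seconds
to_download = [key for key, published in indexed
               if within_download_cap(published, cap, now)]
print(to_download)  # ['recent-video']
```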

yche2990 commented 2 years ago

Thank you. This is really helpful. I have another question. I see the media format is {yyyy_mmdd}{source}{title}{key}_{format}.{ext}. Can I customise this format? And if my video titles are in another language, what setting should I apply to change the naming?

meeb commented 2 years ago

You can indeed customise it: just edit that field using whatever valid {thingnames} you want. See the bottom of the form (scroll right down) for the full list of supported options. Language has no impact; {title}, for example, will work with any language.
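As a rough illustration of how such {placeholder} templates behave, here is a sketch using Python's built-in `str.format`. It is an assumption that TubeSync renders names in a comparable way; the real implementation may sanitise characters or support other placeholders. The function `render_media_name` and the sample values are hypothetical.

```python
def render_media_name(template, **fields):
    """Fill {placeholders} in `template` from keyword arguments.

    Hypothetical helper, not TubeSync's renderer.
    """
    return template.format(**fields)

# A non-English title passes through unchanged: Python strings are
# Unicode, so no language-specific setting is involved.
name = render_media_name(
    "{source}_{title}_{key}.{ext}",
    source="my-channel",
    title="日本語のタイトル",
    key="abc123",
    ext="mkv",
)
print(name)  # my-channel_日本語のタイトル_abc123.mkv
```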

fbartolini commented 1 year ago

On the original question: when indexing the media and calling YoutubeDL.extract_info(), wouldn't the daterange option work to filter out videos with upload dates earlier than the cap date defined on the source?

meeb commented 1 year ago

No, because tubesync first calls extract_info(...) with extract_flat. This only returns a list of the videos on a channel or playlist with their IDs, without metadata such as the upload date, so the filter wouldn't work here.

The only way to obtain the full metadata is to then do a full extract_info(...) once per media item which does then return the upload date. It's only then that tubesync has the date information to decide what and what not to download.
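The two-stage flow described above can be sketched as follows. This is an illustration, not tubesync's code: `select_media` and `flat_index` are hypothetical names, and the metadata lookup is injected as a callable so the demo runs on fake data without touching the network. The yt-dlp calls shown (`YoutubeDL`, the `extract_flat` option, `extract_info(url, download=False)`) exist, but the exact shape of the returned entries for a given channel URL is an assumption.

```python
from datetime import date

def select_media(flat_entries, fetch_metadata, cap_date):
    """Stage 2: look up each ID individually, then filter on upload date."""
    cutoff = cap_date.strftime("%Y%m%d")
    selected = []
    for entry in flat_entries:
        info = fetch_metadata(entry["id"])     # one full lookup per media item
        upload_date = info.get("upload_date")  # yt-dlp uses "YYYYMMDD" strings
        if upload_date and upload_date >= cutoff:
            selected.append(entry["id"])
    return selected

def flat_index(url):
    """Stage 1 with yt-dlp: a flat listing without per-video metadata."""
    import yt_dlp  # imported lazily so the demo below runs without it
    with yt_dlp.YoutubeDL({"extract_flat": True, "quiet": True}) as ydl:
        info = ydl.extract_info(url, download=False)
        return list(info.get("entries") or [])

# Demo with fake data standing in for both stages.
fake_metadata = {
    "a1": {"upload_date": "20200101"},
    "b2": {"upload_date": "20240115"},
}
flat = [{"id": "a1"}, {"id": "b2"}]
picked = select_media(flat, fake_metadata.__getitem__, date(2023, 1, 1))
print(picked)  # ['b2']
```

The date filter can only run inside `select_media`, after the per-item lookup, because the flat entries carry no `upload_date` to compare against.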

You can call extract_info(...) without extract_flat on a channel every time it is indexed. However, this results in yt-dlp performing one HTTP request per page of media plus one HTTP request per media item, which for large channels means thousands of HTTP requests and your IP quickly being throttled or blocked.
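To put rough numbers on this, here is a back-of-the-envelope sketch. The page size of 100 items and the channel size of 5,000 videos are illustrative assumptions, not measured values, and `estimate_requests` is a hypothetical helper.

```python
import math

def estimate_requests(media_count, page_size, flat):
    """Estimate HTTP requests for one indexing pass.

    Flat: one request per page of media.
    Full: one request per page plus one per media item.
    """
    pages = math.ceil(media_count / page_size)
    return pages if flat else pages + media_count

print(estimate_requests(5000, 100, flat=True))   # 50
print(estimate_requests(5000, 100, flat=False))  # 5050
```

Under these assumptions, a full extraction costs roughly a hundred times more requests than a flat one on every index run.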

fbartolini commented 1 year ago

OK, got it. Thanks! Looking at the YouTube API directly, I also noticed that if you use search, the cost of that call is 100 units!