Tzahi12345 / YoutubeDL-Material

Self-hosted YouTube downloader built on Material Design
MIT License

[Reg: International Content] Retroactive lookups for things like translated titles/descriptions, subs, thumbnails if they failed before, possibly other auxiliary data. #139

Open GlassedSilver opened 4 years ago

GlassedSilver commented 4 years ago

Blocks: #107


A download may go through when a video is fresh or after a delay has passed (#134), but oftentimes subs will be added by the community or the creator later on. Sometimes even weeks can pass!

So as not to miss any subs (or other data, see title) that may have become available at the original source, it'd be nice to keep looking for things we haven't downloaded yet, for whatever reason: be it that they weren't available at the time or that the internet connection died at the wrong moment, ...

Again, though, this would be most helpful for subs and translated titles, which are often added after a considerable delay, given the nature of the task that is translating and timing subs for what can be an hour-long video.

Shoutout to Kizuna Ai and Kaguya Luna }:3

Edit: One additional thought: I THINK this could be heavily tied into the path B approach to #134

Tzahi12345 commented 4 years ago

Another good idea, though this does come with the additional difficulty (as you suggested) that it will take an undetermined amount of time to translate the video.

We also don't know what failed. That is, if you add subtitle args to a subscription but those aren't available, we don't know that they didn't download. Actually, we don't know that it was asked for in the first place.

So this requires a couple of things. First, going through the custom args to see if --write-thumbnail or --write-subs was requested, and making some sort of JSON-formatted list of requests which the subscription can have in the DB:

{
    "requested_subtitles": ["en", "de"],
    "requested_thumbnail": true
}
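
A rough sketch of how that request list could be derived from the custom args, in Python purely for illustration. The function name `build_request_record` is hypothetical, and the sketch assumes the args have already been split into a list of strings. It recognizes both spellings of the subtitle flags (`--write-sub`/`--sub-lang` from youtube-dl and `--write-subs`/`--sub-langs` from yt-dlp); anything else is ignored.

```python
from typing import Dict, List, Union


def build_request_record(custom_args: List[str]) -> Dict[str, Union[List[str], bool]]:
    """Derive the 'requested' record from a list of downloader arguments.

    Hypothetical helper: the record shape follows the JSON above.
    """
    subs_requested = False
    langs: List[str] = []
    thumbnail = False
    i = 0
    while i < len(custom_args):
        arg = custom_args[i]
        if arg == "--write-thumbnail":
            thumbnail = True
        elif arg in ("--write-sub", "--write-subs"):
            subs_requested = True
        elif arg in ("--sub-lang", "--sub-langs") and i + 1 < len(custom_args):
            # The language list is a comma-separated value in the next argument.
            langs = [part.strip() for part in custom_args[i + 1].split(",") if part.strip()]
            i += 1  # skip the value we just consumed
        i += 1
    return {
        # Assumption: languages only count if subtitle writing was requested.
        "requested_subtitles": langs if subs_requested else [],
        "requested_thumbnail": thumbnail,
    }


record = build_request_record(
    ["--write-subs", "--sub-langs", "en,de", "--write-thumbnail"]
)
```

One open question this leaves: if subtitles are requested without an explicit language list, the record ends up with an empty list, which looks the same as "not requested". A real implementation would probably need a separate marker for that case.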

Then, we need to store when/if each item is actually received:

{
    "available_subtitles": [
        {"lang": "de", "path": "<lang_path>"}
    ],
    "thumbnail": "<thumbnail_path>"
}
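
With both records stored, working out what is still outstanding is a simple diff. A minimal sketch in Python, assuming the two record shapes above; the function name `missing_items` and the shape of its return value are made up for illustration.

```python
from typing import Any, Dict


def missing_items(requested: Dict[str, Any], received: Dict[str, Any]) -> Dict[str, Any]:
    """Compare what was requested with what has been received so far."""
    # Languages for which a subtitle file has already been stored.
    have = {entry["lang"] for entry in received.get("available_subtitles", [])}
    missing_subs = [
        lang for lang in requested.get("requested_subtitles", []) if lang not in have
    ]
    # The thumbnail is missing if it was requested and no path is stored yet.
    missing_thumbnail = bool(requested.get("requested_thumbnail")) and not received.get("thumbnail")
    return {"subtitles": missing_subs, "thumbnail": missing_thumbnail}


requested = {"requested_subtitles": ["en", "de"], "requested_thumbnail": True}
received = {
    "available_subtitles": [{"lang": "de", "path": "/hypothetical/video.de.vtt"}],
    "thumbnail": None,
}
outstanding = missing_items(requested, received)
```

Here `outstanding` reports that the English subtitles and the thumbnail still need to be fetched, while the German subtitles are done.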

Also, this is probably gonna require a flag separate from fresh_download, as the check should continue to occur as long as the requested subtitles/thumbnails/whatever else are not retrieved. This shouldn't happen every time a subscription is checked (that seems excessive), maybe once every 10 checks?

I think we'll have to flesh out this idea a bit more so it doesn't have so many moving parts. But you're right in that this is a fundamental problem that should be fixed -- sometimes there's missing info at the time of video upload, and we shouldn't rely on that incomplete snapshot of a newly uploaded video (at least not permanently).
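
To make the "once every 10 checks" idea concrete, here is a minimal sketch in Python. The names `RetroState`, `pending` and `should_retro_check` are hypothetical; `pending` stands in for the separate flag mentioned above, and the default of 10 is just the number floated in this comment.

```python
from dataclasses import dataclass


@dataclass
class RetroState:
    """Per-subscription bookkeeping (hypothetical field names)."""
    pending: bool                 # True while requested items are still missing
    checks_since_retro: int = 0   # subscription checks since the last retro lookup


def should_retro_check(state: RetroState, every_n: int = 10) -> bool:
    """Call once per subscription check; True on every `every_n`-th call while pending."""
    if not state.pending:
        return False
    state.checks_since_retro += 1
    if state.checks_since_retro >= every_n:
        state.checks_since_retro = 0
        return True
    return False


state = RetroState(pending=True)
results = [should_retro_check(state) for _ in range(20)]
```

In this run, the 10th and 20th subscription checks trigger a retro lookup and the others skip it. Once everything requested has been received, `pending` would be set to False and the lookups stop.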

GlassedSilver commented 4 years ago

> This shouldn't happen every time a subscription is checked (that seems excessive), maybe once every 10 checks?

If a user wants to customize the check interval for subscriptions, this can easily mess up the Retro-Lookup/Retro-Care feature. (I'll just call it that from now on. Retro-Care sounds like a good term? It includes the caring aspect for previously issued downloads that are now retroactively maintained and managed... Heck yeah, I like this name!)

So maybe letting the user set a separate interval for each could be good?

I'd probably set it pretty tight myself, but having it customizable would mean that the Retro-Care feature could scale really well from small to large collections of content. If someone has a couple of subscriptions that are prone to upload deletions or something, they might want to tighten the checks so as not to miss the smaller windows of opportunity.
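
A sketch of what two independent, user-configurable intervals might look like, in Python for illustration only. The class `CheckIntervals`, its field names and its default values are all made up; the point is just that the Retro-Care schedule is time-based and no longer depends on how often the subscription itself is checked.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class CheckIntervals:
    """Hypothetical per-subscription settings; defaults are arbitrary."""
    subscription_check: timedelta = timedelta(minutes=5)
    retro_care: timedelta = timedelta(hours=6)


def is_due(last_run: Optional[datetime], interval: timedelta, now: datetime) -> bool:
    """A task is due if it has never run or if `interval` has elapsed since it last ran."""
    return last_run is None or now - last_run >= interval


intervals = CheckIntervals(retro_care=timedelta(hours=1))  # a "tight" user setting
now = datetime(2020, 7, 1, 12, 0)
last_retro = datetime(2020, 7, 1, 10, 30)
last_sub_check = datetime(2020, 7, 1, 11, 58)

retro_due = is_due(last_retro, intervals.retro_care, now)
sub_check_due = is_due(last_sub_check, intervals.subscription_check, now)
```

With these example values the retro lookup is due (90 minutes have passed against a 1 hour interval), while the regular subscription check is not (2 minutes against 5).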