PotcFdk / youtube-sync

Script for maintaining an up-to-date offline mirror of a YouTube channel.
Apache License 2.0

Implement user-configurable format selection (with support for a list of sane presets of configuration combinations) #11

Open PotcFdk opened 6 years ago

PotcFdk commented 6 years ago

The current plan for everything related to formats, filenames and codecs is now located here

Old issue description:

#9 has added a way to set the -f parameter for all downloads within a profile.

This is not exposed in the UI yet, requiring manual user intervention. Implement a way to deal with this from within the UI.

Also see https://github.com/PotcFdk/youtube-sync/issues/8#issuecomment-424353236

myrdd commented 6 years ago

I'd like to add: you probably want to add the -x option for audio-only profiles.

For audio-only, bestaudio+bestvideo (--format; see #9) needs to be changed to bestaudio. However, if the default format was simply best, no change of --format would be necessary, afaik. What do you think @PotcFdk?

Regarding audio-only, there might be several clean solutions:

PotcFdk commented 6 years ago

I think it's wise to keep specifying an -f ourselves, especially considering that #7 will probably require this, too.

Using best is not what we want at all, because that option's behavior is defined as

Select the best quality format represented by a single file with video and audio.

YouTube doesn't serve the high-quality videos as single files but instead provides the audio and video in two different files (DASH). Hence, we need to make sure we grab the highest quality stream (audio and video) individually. youtube-dl does this by default since version 2015.04.26 by having -f bestvideo+bestaudio/best as the default format selection. See this in the paragraph that begins with Since the end of April 2015.

I feel like the META/mediatype one (or, rather, the already-merged #9 called it META/format) is the best type of solution, because it's the most expandable one.

Ideally, I'd love to allow the user to specify any format selection that's supported by youtube-dl, however, complex values like (mp4,webm)[height <=? 1080]+bestaudio/bestvideo[height <=? 720]+bestaudio/best might lead to difficulties with figuring out what youtube-dl will actually be doing (video? audio? both?) and how to proceed actually calling youtube-dl (which merge-output-format/file extension to use).

Perhaps it'd make sense to leave that to the user and define some easy profile preset for everyday UI usage (e.g. keeping the current default for video+audio plus adding an audio-only one) and let advanced users supply their own -f and --merge-output-format parameters?

What I mean by that is, for example, adding an optional audio-only thing to the setup command that pre-seeds the META/format with bestaudio;mka or something like that. If you wish to change that, you can then manipulate it as you wish, turn it into bestaudio[filesize>5M];mp3 or whatever else might be helpful.
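To make the `selector;container` idea concrete, here is a minimal sketch of how such a META/format value could be split. The `;` separator comes from the examples above; the `FormatSpec` type, the `parse_format_line` name and the rule of splitting on the last `;` are assumptions for illustration, not how youtube-sync actually reads the file.

```python
from typing import NamedTuple, Optional


class FormatSpec(NamedTuple):
    """Hypothetical in-memory form of one META/format value."""
    selector: str              # passed to youtube-dl via -f
    container: Optional[str]   # e.g. "mka"; None if not given


def parse_format_line(line: str) -> FormatSpec:
    """Split 'selector;container' on the last ';' (assumed separator)."""
    text = line.strip()
    selector, sep, container = text.rpartition(";")
    if not sep:
        # No ';' at all: the whole value is the format selector.
        return FormatSpec(text, None)
    return FormatSpec(selector.strip(), container.strip() or None)
```

With this, `bestaudio[filesize>5M];mp3` yields the selector `bestaudio[filesize>5M]` and the container `mp3`, while a plain `bestvideo+bestaudio/best` is kept as a selector with no container.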

I really need to think about this.

myrdd commented 6 years ago

Thanks for your answer and clarification!

Perhaps it'd make sense to leave that to the user and define some easy profile preset for everyday UI usage (e.g. keeping the current default for video+audio plus adding an audio-only one) and let advanced users supply their own -f and --merge-output-format parameters?

Sounds good to me!

Note the following:

PotcFdk commented 5 years ago

I have just discussed some ideas with @thrdroom and we ended up with the following battle-plan:

We will drop the fixed .mkv from the filename template.
This requires some careful consideration, which we have already discussed in private, but ultimately, I hope I will spot most issues during the implementation, or during the implementation of the profile presets. I'll put a tl;dr after each paragraph for your convenience.

Dropping the file extension allows us to target many different format configurations. However, this comes at the cost of no longer having deterministic filenames. Even within a single profile, a well-chosen -f can cause the file extension to differ from file to file. Thus, we will no longer pay attention to the file extension, because it could be anything at all. This leads to a new problem: how can we make sure that we don't end up with duplicates of videos? Formerly, we made sure each video ended up at a very specific filename, so youtube-dl skipped videos that already existed. Fortunately, youtube-dl allows us to deal with this using the --download-archive option, which takes a file listing the video IDs that shall not be downloaded again. Doing so, however, would prevent missing videos from being redownloaded in case we delete a video, or perhaps in other cases. The solution is to regenerate the download-archive list from the currently existing files each time before youtube-dl is called during an update. That way, youtube-dl will consider missing files for download again. In case the user wishes to ignore certain videos, we will provide a way to specify a secondary list that will be merged with the list of currently existing videos, so that they won't be redownloaded either. tl;dr: youtube-dl will drop files with arbitrary file extensions as it sees fit based on your -f. They can vary from file to file. A download-archive file will be used to keep track of already-existing files.
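The regeneration step could look roughly like the sketch below. It relies on the download-archive format of one `<extractor> <id>` entry per line (e.g. `youtube dQw4w9WgXcQ`). Everything else is an assumption for illustration: the filename scheme `<title>-<11-character ID>.<ext>`, the function names, and the way the secondary ignore list is passed in.

```python
import re
from typing import Iterable, List, Set

# Assumed naming scheme: "<anything>-<11-char video ID>.<any extension>".
# The real filename template may differ; adjust the pattern accordingly.
ID_PATTERN = re.compile(r"-([0-9A-Za-z_-]{11})\.[^.]+$")


def ids_from_files(names: Iterable[str]) -> Set[str]:
    """Collect video IDs from existing filenames, whatever their extension."""
    ids = set()
    for name in names:
        match = ID_PATTERN.search(name)
        if match:
            ids.add(match.group(1))
    return ids


def build_archive(existing_files: Iterable[str],
                  ignore_ids: Iterable[str] = ()) -> List[str]:
    """Merge IDs found on disk with the user's ignore list into archive lines."""
    ids = ids_from_files(existing_files) | set(ignore_ids)
    return ["youtube " + video_id for video_id in sorted(ids)]
```

The returned lines would be written to a fresh file before every update and handed to youtube-dl via --download-archive. A video whose file was deleted no longer appears in the list, so it is eligible for download again, while IDs on the ignore list stay excluded.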

Now with that out of the way, we can allow youtube-dl to drop non-mkvs, such as our old example goal... mp3s! The bad news is that YouTube doesn't deliver mp3s, and if we just do -f bestaudio, it might drop audio-only .webm files. youtube-dl allows us to use -x along with --audio-format, though. We don't enforce any file extensions anymore, so we can let youtube-dl decide. tl;dr: we can give the user access to --audio-format (and probably also --recode-video for the same reasons).

@thrdroom mentioned the desire to be able to remux the file to another container. So far, we've been attempting this with --merge-output-format, but this only works if a merge actually happens. Sometimes there is none, and this can even change within the same profile. As already explained above, once we drop the file extension from the template, youtube-dl will choose the correct one for us. We can still use --merge-output-format to make sure that, when merging, youtube-dl uses our desired container. But what if it sometimes doesn't merge? What if we always want to target the same container? A simple example is an audio-only profile. Say we end up with a bunch of .m4a, .ogg and .mp3 files. Assuming all of these fit inside an .mka file, the user might want to always end up with .mka files. Not that we require that behavior (already-downloaded detection would be done by the download-archive, after all), but the user might have that desire for some reason. youtube-dl currently doesn't support specifying a target container for remuxing only, without recoding. Hence, for this use case, we will allow the user to activate a post-youtube-dl ffmpeg step that remuxes whatever youtube-dl dropped into the user's target container of choice. This will be remux-only (-c copy), because we cannot reliably predict what youtube-dl will drop: it depends on what is downloaded with -f (which depends on external causes) along with user choices such as --audio-format. Hence, it's the user's job to make sure such a custom remux config is compatible with whatever youtube-dl drops. When youtube-dl implements such functionality in the future, we can easily migrate to a better solution. tl;dr: for any supported youtube-dl configuration, you can activate a remux (non-recode) ffmpeg step that packs the original file's streams into your desired container.
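As a rough sketch of that post-youtube-dl step, the function below only builds the ffmpeg command line and does not run it. The `-c copy` part is taken from the plan above. The function name, the skip when the file already has the target extension, and the use of ffmpeg's `-n` (never overwrite an existing output) are assumptions added for illustration.

```python
from pathlib import Path
from typing import List, Optional


def remux_command(source: str, container: str) -> Optional[List[str]]:
    """Build an ffmpeg remux-only command, or None if nothing needs doing."""
    src = Path(source)
    target = src.with_suffix("." + container.lstrip("."))
    if target == src:
        # Already has the desired extension; assume no remux is needed.
        return None
    # -n: do not overwrite an existing output file.
    # -c copy: copy the streams as-is, no re-encoding.
    return ["ffmpeg", "-n", "-i", str(src), "-c", "copy", str(target)]
```

For example, `remux_command("Title-dQw4w9WgXcQ.webm", "mka")` produces a command that writes `Title-dQw4w9WgXcQ.mka`. Whether ffmpeg succeeds still depends on the streams fitting into the chosen container, which remains the user's responsibility as described above.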

In summary, as for the user-facing configurations that we can pass to youtube-dl:

In addition to those, a remux target container will be configurable. It will be used as the extension of ffmpeg's output filename and also passed to youtube-dl via --merge-output-format, so that in case youtube-dl does the merging itself, we can skip doing it ourselves.
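One way to picture how these settings could turn into a youtube-dl call is the sketch below. The flags themselves (-f, -x, --audio-format, --recode-video, --merge-output-format, --download-archive) are real youtube-dl options. The `ProfileConfig` fields and the `youtube_dl_args` helper are hypothetical names, and only forwarding the remux target when it is one of the containers youtube-dl documents for --merge-output-format is an assumption about how one might handle targets such as mka.

```python
from dataclasses import dataclass
from typing import List, Optional

# Containers listed in youtube-dl's help text for --merge-output-format.
MERGE_CONTAINERS = {"mkv", "mp4", "ogg", "webm", "flv"}


@dataclass
class ProfileConfig:
    """Hypothetical per-profile settings; field names are illustrative."""
    format_selector: str = "bestvideo+bestaudio/best"
    audio_format: Optional[str] = None
    recode_video: Optional[str] = None
    remux_container: Optional[str] = None
    archive_file: Optional[str] = None


def youtube_dl_args(cfg: ProfileConfig, url: str) -> List[str]:
    """Translate a profile configuration into a youtube-dl argument list."""
    args = ["youtube-dl", "-f", cfg.format_selector]
    if cfg.audio_format:
        args += ["-x", "--audio-format", cfg.audio_format]
    if cfg.recode_video:
        args += ["--recode-video", cfg.recode_video]
    if cfg.remux_container in MERGE_CONTAINERS:
        # Let youtube-dl merge straight into the target container if it can.
        args += ["--merge-output-format", cfg.remux_container]
    if cfg.archive_file:
        args += ["--download-archive", cfg.archive_file]
    args.append(url)
    return args
```

A remux target outside that set (mka, for instance) is simply not forwarded here and would be left to the separate ffmpeg step.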

Further considerations

#18 will be resolved by this plan as we're not going to be forcing non-mkv-compatible subtitles into mkv containers (or really, we're not forcing any codec into any container! youtube-dl decides the output container and we allow it to set the correct filename extension).

The idea of presets, first mentioned two comments above this one, can be altered to integrate nicely with this plan:

We will build a list into youtube-sync that is supposed to list all the popular use cases of youtube-sync. Two example configurations:

These presets will specify values for all the relevant configurations that have been explained in the first part of this comment. I will consider the presets to be the default operating modes of youtube-sync.

If you want another operating mode (e.g. "I want to download the best possible audio stream, convert it to mp3 and put it into an mka-container"), you will not be using one of the presets but instead you will specify the relevant configuration yourself. This will of course be documented. The goal, however, is to make sure that common choices are going to be represented by presets. This avoids user errors (e.g. youtube-dl dropping an mp3 file and then we try to remux that into a wav file).
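The preset-or-custom split might be modelled along these lines. The two preset entries mirror the examples discussed earlier in this thread (the current video+audio default and an audio-only bestaudio/mka mode). The preset names, the dictionary keys and the `resolve_config` helper are placeholders, not the final list.

```python
from typing import Dict, Optional

Config = Dict[str, Optional[str]]

# Placeholder presets based on the examples in this thread.
PRESETS: Dict[str, Config] = {
    "video": {"format": "bestvideo+bestaudio/best", "audio_format": None, "remux": None},
    "audio-only": {"format": "bestaudio", "audio_format": None, "remux": "mka"},
}


def resolve_config(preset: Optional[str] = None, **custom: Optional[str]) -> Config:
    """Return a preset's values, or a fully user-specified configuration."""
    if preset is not None:
        if custom:
            raise ValueError("use either a preset or a custom configuration, not both")
        if preset not in PRESETS:
            raise ValueError("unknown preset: " + preset)
        return dict(PRESETS[preset])  # copy so callers cannot mutate the preset
    config: Config = {"format": None, "audio_format": None, "remux": None}
    unknown = set(custom) - set(config)
    if unknown:
        raise ValueError("unknown setting(s): " + ", ".join(sorted(unknown)))
    config.update(custom)
    return config
```

The "best audio, converted to mp3, inside an mka container" mode from the example would then be `resolve_config(format="bestaudio", audio_format="mp3", remux="mka")`, with the user responsible for that combination being valid.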