ytdl-org / youtube-dl

Command-line program to download videos from YouTube.com and other video sites
http://ytdl-org.github.io/youtube-dl/
The Unlicense

Make parsed metadata variables available to output template #11747

Open jblachly opened 7 years ago

jblachly commented 7 years ago

What is the purpose of your issue?


Log

$ youtube-dl -v --write-description --write-info-json --write-annotations --write-thumbnail --all-subs --metadata-from-title "%(series)s, Season %(season_number)s: %(episode)s (Episode %(episode_number)s)" -o "%(series)s - S%(season_number)sE%(episode_number)s - %(episode)s.%(ext)s" --exec "touch {}.done"  http://www.pbs.org/wgbh/masterpiece/episodes/sherlock-s4-e3/
[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: [u'-v', u'--write-description', u'--write-info-json', u'--write-annotations', u'--write-thumbnail', u'--all-subs', u'--metadata-from-title', u'%(series)s, Season %(season_number)s: %(episode)s (Episode %(episode_number)s)', u'-o', u'%(series)s - S%(season_number)sE%(episode_number)s - %(episode)s.%(ext)s', u'--exec', u'touch {}.done', u'http://www.pbs.org/wgbh/masterpiece/episodes/sherlock-s4-e3/']
[debug] Encodings: locale UTF-8, fs utf-8, out UTF-8, pref UTF-8
[debug] youtube-dl version 2017.01.16
[debug] Python version 2.7.12 - Darwin-15.6.0-x86_64-i386-64bit
[debug] exe versions: none
[debug] Proxy map: {}
[pbs] Downloading JSON metadata
[pbs] sherlock-s4-e3: Downloading webpage
[pbs] sherlock-s4-e3: Downloading player page
[pbs] sherlock-s4-e3: Downloading widget/partnerplayer page
[pbs] sherlock-s4-e3: Downloading portalplayer page
[pbs] sherlock-s4-e3: Downloading hls-1080p-16x9 video url info
[pbs] sherlock-s4-e3: Downloading m3u8 information
[pbs] sherlock-s4-e3: Downloading mp4-baseline-16x9 video url info
[pbs] sherlock-s4-e3: Downloading hls-800k-16x9 video url info
[pbs] sherlock-s4-e3: Downloading m3u8 information
[pbs] sherlock-s4-e3: Checking http-1200k video URL
[pbs] sherlock-s4-e3: Checking http-6500k video URL
[pbs] sherlock-s4-e3: Checking http-4500k video URL
[pbs] sherlock-s4-e3: Checking http-2500k video URL
[pbs] sherlock-s4-e3: Checking http-800k video URL
[pbs] sherlock-s4-e3: Checking http-400k video URL
WARNING: There's no description to write.
[info] Writing video annotations to: NA - SNAENA - NA.annotations.xml
WARNING: There are no annotations to write.
[info] Writing video subtitles to: NA - SNAENA - NA.en.vtt
[info] Writing video description metadata as JSON to: NA - SNAENA - NA.info.json
[pbs] 2365931000: Downloading thumbnail ...
[pbs] 2365931000: Writing thumbnail to: NA - SNAENA - NA.jpg
[debug] Invoking downloader on u'http://d6uz0or6bt1ws.cloudfront.net/videos/masterpiece/9784d628-a82e-45c1-83c9-fb12f32d3184/278743/hd-1080p-mezzanine-16x9/95b03357_mast4703-16x9-mp4-6500k.mp4'
[download] Destination: NA - SNAENA - NA.mp4
[download] 100% of 4.12GiB in 34:52
[fromtitle] parsed series: Masterpiece - Sherlock
[fromtitle] parsed episode_number: 3
[fromtitle] parsed season_number: 4
[fromtitle] parsed episode: The Final Problem
[exec] Executing command: touch 'NA - SNAENA - NA.mp4'.done
<EOT>

Make parsed metadata variables (--metadata-from-title) available to output template (-o)

Presently, it appears as if the only metadata available to the output template engine is that returned by the _real_extract() function in the site-specific extractor.

So in my example (PBS), the only fields available when the output template is processed are the title, id, and a few others. Indeed, the log I pasted above shows that the metadata extraction from the title happens quite late, after the files have been written.
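The `NA - SNAENA - NA` filenames in the log are what you would expect if the template renderer substitutes a placeholder for every field it cannot find. Here is a minimal sketch of that behaviour, not youtube-dl's actual code: the helper name `render_template`, the `'NA'` fallback via `defaultdict`, and the sample `info` dict (with a title reconstructed from the `[fromtitle]` lines in the log) are all assumptions for illustration.

```python
import collections


def render_template(outtmpl, info):
    """Render a %-style template, using 'NA' for any field missing from info."""
    fields = collections.defaultdict(lambda: 'NA', info)
    return outtmpl % fields


outtmpl = '%(series)s - S%(season_number)sE%(episode_number)s - %(episode)s.%(ext)s'

# Only extractor-provided fields are present; series, season_number,
# episode_number and episode have not been parsed from the title yet.
info = {
    'id': '2365931000',
    'title': 'Masterpiece - Sherlock, Season 4: The Final Problem (Episode 3)',
    'ext': 'mp4',
}

filename = render_template(outtmpl, info)
print(filename)  # NA - SNAENA - NA.mp4
```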

More generally, this is not specific to the PBS extractor.

Desired behaviour: Make metadata variables extracted from the title also available to the file output template.

Proposed fix: Move the processing of --metadata-from-title earlier (during initial extraction, before writing (m)any files).
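To illustrate the ordering being asked for, here is a rough standalone sketch that parses the title first and only then renders the output template. It is not a patch against youtube-dl: `format_to_regex` and `add_fields_from_title` are hypothetical helpers written for this example, and the title string is reconstructed from the `[fromtitle]` lines in the log.

```python
import re


def format_to_regex(fmt):
    """Turn '%(a)s - %(b)s' into a regex with named groups 'a' and 'b'."""
    regex, lastpos = '', 0
    for m in re.finditer(r'%\((\w+)\)s', fmt):
        regex += re.escape(fmt[lastpos:m.start()])  # literal text between fields
        regex += '(?P<%s>.+)' % m.group(1)          # one named group per field
        lastpos = m.end()
    regex += re.escape(fmt[lastpos:])
    return regex


def add_fields_from_title(info, titleformat):
    """Return a copy of info extended with fields parsed out of info['title']."""
    merged = dict(info)
    match = re.match(format_to_regex(titleformat), info['title'])
    if match:
        merged.update(match.groupdict())
    return merged


titleformat = '%(series)s, Season %(season_number)s: %(episode)s (Episode %(episode_number)s)'
outtmpl = '%(series)s - S%(season_number)sE%(episode_number)s - %(episode)s.%(ext)s'
info = {
    'title': 'Masterpiece - Sherlock, Season 4: The Final Problem (Episode 3)',
    'ext': 'mp4',
}

# Step 1: parse the title. Step 2: render the filename from the merged fields.
info = add_fields_from_title(info, titleformat)
filename = outtmpl % info
print(filename)  # Masterpiece - Sherlock - S4E3 - The Final Problem.mp4
```

With the parse step run before any file is named, the description, subtitle, thumbnail and video files would all share the same fully populated stem.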

While writing this report I noticed issue #11108, which is another manifestation of the same problem. Note that at the end someone correctly identifies that the --metadata-from-title values are not available as variables to the output template. In my view it would be more sensible to move this parsing earlier and make these variables available, instead of adding a whole 'nother parameter, --variables-from-title, as suggested in #11108.

I haven't digested enough of the codebase yet to make the change myself. Thanks for your consideration.

yan12125 commented 7 years ago

> Proposed fix: Move the processing of --metadata-from-title earlier (during initial extraction, before writing (m)any files).

Yep that's the correct way.