ytdl-org / youtube-dl

Command-line program to download videos from YouTube.com and other video sites
http://ytdl-org.github.io/youtube-dl/
The Unlicense

[ThePlatform] Only ads are downloaded #26047

Closed rredford6 closed 6 months ago

rredford6 commented 4 years ago

Verbose log

[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: ['--verbose', '--hls-prefer-native', '--cookies', 'cookies.txt', 'https://player.theplatform.com/p/HNK2IC/<snip>#playerurl=https%3A//www.nbc.com/the-profit/video/an-inside-look-kota-longboards/4117074']
[debug] Encodings: locale cp1252, fs mbcs, out cp437, pref cp1252
[debug] youtube-dl version 2020.06.06
[debug] Python version 3.4.4 (CPython) - Windows-10-10.0.17763
[debug] exe versions: ffmpeg N-91481-gb8c4d2b2ed, ffprobe N-91481-gb8c4d2b2ed
[debug] Proxy map: {}
[ThePlatform] 4117074: Downloading webpage
[ThePlatform] 4117074: Downloading SMIL data
[ThePlatform] 4117074: Downloading m3u8 information
[ThePlatform] 4117074: Downloading JSON metadata
[debug] Default format spec: bestvideo+bestaudio/best
[debug] Invoking downloader on 'https://west.manifest.na.theplatform.com/m/HNK2IC/<snip>'
[hlsnative] Downloading m3u8 manifest
[hlsnative] Total fragments: 381
[download] Destination: An Inside Look - KOTA Longboards-4117074.mp4
[download] 100% of 1.42GiB in 02:43
[debug] ffmpeg command line: ffprobe -show_streams "file:An Inside Look - KOTA Longboards-4117074.mp4"
[ffmpeg] Fixing malformed AAC bitstream in "An Inside Look - KOTA Longboards-4117074.mp4"
[debug] ffmpeg command line: ffmpeg -y -loglevel "repeat+info" -i "file:An Inside Look - KOTA Longboards-4117074.mp4" -c copy -f mp4 "-bsf:a" aac_adtstoasc "file:An Inside Look - KOTA Longboards-4117074.temp.mp4"

Description

Videos from ThePlatform are not downloading correctly.

I'm using an MSO that is not supported, so I have to get the link to the embedded player manually by visiting the video page:

https://www.nbc.com/the-profit/video/an-inside-look-kota-longboards/4117074

And then running the following JavaScript command:

document.getElementsByTagName('iframe')[0].src

And then pass that in to youtube-dl (see verbose log above).

Then inspect the downloaded file:

ffprobe "An Inside Look - KOTA Longboards-4117074.mp4" | grep Duration
  Duration: 00:06:00.71, start: 0.000000, bitrate: 3962 kb/s

And then fetch the expected duration:

youtube-dl --get-duration <same url as above>
43:14

There seem to be 37 minutes missing. However, upon playing back the video, it is nothing but ads. Note that --include-ads was not specified. Nothing from the actual video got downloaded; all 43 minutes are missing.
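For reference, the 37-minute figure is simply the difference between the two durations reported above. A small Python sketch of that arithmetic (the helper name `to_seconds` is made up for illustration and is not part of youtube-dl):

```python
def to_seconds(timestamp):
    """Convert 'MM:SS' or 'HH:MM:SS.ff' into a number of seconds."""
    seconds = 0.0
    for part in timestamp.split(':'):
        seconds = seconds * 60 + float(part)
    return seconds


expected = to_seconds('43:14')           # from youtube-dl --get-duration
downloaded = to_seconds('00:06:00.71')   # from ffprobe on the downloaded file
missing = expected - downloaded

# Report the gap as whole minutes and seconds.
minutes, secs = divmod(int(missing), 60)
print('missing: %d min %d s' % (minutes, secs))
```

Since the roughly 6 minutes that were downloaded are all ads, the real gap is the full 43:14 of content.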


If we manually inspect one of the m3u8 files listed in master.m3u8, we can see the following in there:

#EXT-X-VMAP-AD-BREAK:ID=mid_roll_3
<snip>
#EXT-X-VMAP-AD-BREAK-END

The lines between these two tags need to be stripped before the playlist is passed to the downloader, because right now the downloader is fetching them as if they were part of the video.
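A minimal sketch of the stripping step I have in mind, in Python. This is not existing youtube-dl code; the function name `strip_ad_breaks` is hypothetical, and it assumes the ad segments are always wrapped in exactly the two tags shown above, with no nesting:

```python
AD_BREAK_START = '#EXT-X-VMAP-AD-BREAK:'
AD_BREAK_END = '#EXT-X-VMAP-AD-BREAK-END'


def strip_ad_breaks(m3u8_text):
    """Return the playlist with every ad-break block removed.

    Both marker lines and everything between them are dropped;
    all other lines are kept unchanged and in order.
    """
    kept = []
    in_ad_break = False
    for line in m3u8_text.splitlines():
        stripped = line.strip()
        if stripped.startswith(AD_BREAK_END):
            in_ad_break = False
            continue
        if stripped.startswith(AD_BREAK_START):
            in_ad_break = True
            continue
        if not in_ad_break:
            kept.append(line)
    return '\n'.join(kept) + '\n'


# Made-up playlist fragment, only to show the effect.
sample = (
    '#EXTM3U\n'
    '#EXTINF:6.0,\n'
    'content1.ts\n'
    '#EXT-X-VMAP-AD-BREAK:ID=mid_roll_3\n'
    '#EXTINF:6.0,\n'
    'ad1.ts\n'
    '#EXT-X-VMAP-AD-BREAK-END\n'
    '#EXTINF:6.0,\n'
    'content2.ts\n'
)
print(strip_ad_breaks(sample))
```

A real fix would probably also need to check whether any discontinuity or key tags around the removed block still make sense afterwards; I have not looked into that.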


This issue is reproducible on all videos from nbc.com that require login. Any video not requiring login downloads fine.

rredford6 commented 1 year ago

@89z Try anything here >40 minutes https://www.nbc.com/shows/

rredford6 commented 5 months ago

If you have a fix please open a PR!

rredford6 commented 4 months ago

You could always post pseudocode here and a community member can integrate the changes.

3052 commented 4 months ago

If it's regarding NBC, they moved to Widevine a while back. I have code for that, but I believe Widevine support is banned from this repo, so you can visit my repo if you want.