Closed wolferikg closed 5 years ago
CBS News changed their video URLs from
https://www.cbsnews.com/videos/126-cbs-evening-news-2/
to
https://www.cbsnews.com/video/126-cbs-evening-news-2/
Line 14 in https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/cbsnews.py
needs to be changed from...
_VALID_URL = r'https?://(?:www\.)?cbsnews\.com/(?:news|videos)/(?P
to...
_VALID_URL = r'https?://(?:www\.)?cbsnews\.com/(?:news|video)/(?P
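A minimal sketch of how the pattern could accept both the old and the new path, in case old links are still around. Only the `(?:news|videos?)` alternation is based on the change proposed above; the named group `id` and its character class are assumptions for illustration, since the pattern quoted here is cut off after `(?P`.

```python
import re

# Assumption: the "id" group and its character class are made up for this
# sketch; only the news/video/videos alternation comes from the thread.
VALID_URL = r'https?://(?:www\.)?cbsnews\.com/(?:news|videos?)/(?P<id>[\da-z_-]+)'


def match_id(url):
    """Return the slug captured by the pattern, or None if the URL does not match."""
    m = re.match(VALID_URL, url)
    return m.group('id') if m else None


# Old-style and new-style URLs from the report.
old_style = match_id('https://www.cbsnews.com/videos/126-cbs-evening-news-2/')
new_style = match_id('https://www.cbsnews.com/video/126-cbs-evening-news-2/')
print(old_style, new_style)
```

Writing `videos?` instead of replacing `videos` with `video` keeps both forms matching. It only affects which extractor claims the URL, not what that extractor is able to pull from the page.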
I think the change on their end goes a little deeper than that. I tried exactly what you suggested and unfortunately after I make that change youtube-dl is unable to extract playlist JSON info.
Before change:
youtube-dl $ ./youtube-dl http://www.cbsnews.com/video/131-cbs-evening-news/ -F -v
[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: ['http://www.cbsnews.com/video/131-cbs-evening-news/', '-F', '-v']
[debug] Encodings: locale UTF-8, fs utf-8, out UTF-8, pref UTF-8
[debug] youtube-dl version 2018.01.27
[debug] Python version 3.4.5 (CPython) - Linux-4.9.16-gentoo-x86_64-Intel-R-_Pentium-R-_CPUJ2900@_2.41GHz-with-gentoo-2.3
[debug] exe versions: ffmpeg N-86258-g5782e0b, ffprobe N-86258-g5782e0b, rtmpdump 2.4
[debug] Proxy map: {}
[generic] 131-cbs-evening-news: Requesting header
[redirect] Following redirect to https://www.cbsnews.com/video/131-cbs-evening-news/
[generic] 131-cbs-evening-news: Requesting header
WARNING: Falling back on generic information extractor.
[generic] 131-cbs-evening-news: Downloading webpage
[generic] 131-cbs-evening-news: Extracting information
[generic] 131-cbs-evening-news: Downloading m3u8 information
[download] Downloading playlist: 1/31: CBS Evening News
[generic] playlist 1/31: CBS Evening News: Collected 1 video ids (downloading 1 of them)
[download] Downloading video 1 of 1
[info] Available formats for 131-cbs-evening-news:
format code  extension  resolution  note
hls-202-0    mp4        320x180     202k , avc1.4d400d, mp4a.40.2
hls-202-1    mp4        320x180     202k , avc1.4d400d, mp4a.40.2
hls-466-0    mp4        640x360     466k , avc1.66.30, mp4a.40.2
hls-466-1    mp4        640x360     466k , avc1.66.30, mp4a.40.2 (best)
[download] Finished downloading playlist: 1/31: CBS Evening News
After:
youtube-dl $ ./youtube-dl http://www.cbsnews.com/video/131-cbs-evening-news/ -F -v
[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: ['http://www.cbsnews.com/video/131-cbs-evening-news/', '-F', '-v']
[debug] Encodings: locale UTF-8, fs utf-8, out UTF-8, pref UTF-8
[debug] youtube-dl version 2018.01.27
[debug] Python version 3.4.5 (CPython) - Linux-4.9.16-gentoo-x86_64-Intel-R-_Pentium-R-_CPUJ2900@_2.41GHz-with-gentoo-2.3
[debug] exe versions: ffmpeg N-86258-g5782e0b, ffprobe N-86258-g5782e0b, rtmpdump 2.4
[debug] Proxy map: {}
[cbsnews] 131-cbs-evening-news: Downloading webpage
ERROR: Unable to extract playlist JSON info; please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; type youtube-dl -U to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.
Traceback (most recent call last):
  File "./youtube-dl/youtube_dl/YoutubeDL.py", line 784, in extract_info
    ie_result = ie.extract(url)
  File "./youtube-dl/youtube_dl/extractor/common.py", line 438, in extract
    ie_result = self._real_extract(url)
  File "./youtube-dl/youtube_dl/extractor/cbsnews.py", line 91, in _real_extract
    'playlist JSON info', group='json'), video_id)['state']
  File "./youtube-dl/youtube_dl/extractor/common.py", line 794, in _search_regex
    raise RegexNotFoundError('Unable to extract %s' % _name)
youtube_dl.utils.RegexNotFoundError: Unable to extract playlist JSON info; please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; type youtube-dl -U to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.
I'm running into this too, even with the 2018.03.10 version :-/
I actually found that if you visit the video's page, grab the API URL from CBSNEWS.defaultPayload.items.video,
and use that URL on your command line, then it works and grabs the video.
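For anyone who wants to do that lookup without opening the page source by hand, here is a rough Python sketch. It assumes the page contains a line of the form `CBSNEWS.defaultPayload = {...}` whose `items` list carries a `video` field, which is what the tip above describes. The sample HTML and the path in it are made up for illustration, and the real page layout may differ or change.

```python
import json
import re


def extract_video_field(html):
    """Return items[0]['video'] from a CBSNEWS.defaultPayload assignment, or None."""
    m = re.search(r'CBSNEWS\.defaultPayload\s*=\s*', html)
    if not m:
        return None
    try:
        # raw_decode parses one JSON value starting at the given index and
        # ignores whatever follows it (a trailing semicolon, closing tags, ...).
        payload, _end = json.JSONDecoder().raw_decode(html, m.end())
    except ValueError:
        return None
    if not isinstance(payload, dict):
        return None
    items = payload.get('items') or []
    return items[0].get('video') if items else None


# Made-up sample page; the path below is a placeholder, not a real CBS URL.
sample = ('<script>CBSNEWS.defaultPayload = '
          '{"items": [{"video": "/path/to/clip_phone.m3u8"}]};</script>')
print(extract_video_field(sample))
```

The value it returns is what you would then hand to youtube-dl on the command line.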
Thanks for the hint @cfxd ... I finally went ahead and whipped up a quick script that seems to work pretty well. You need to install 'jq' for this to work (https://stedolan.github.io/jq/).
wolf$ cat cbsnews.sh
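#!/bin/bash
# Print usage and exit.
usage () {
    echo "$(basename "$0") Usage:"
    echo "$(basename "$0") <URL> [-d]"
    echo "  -d // dry run: print video-url and exit."
    echo ""
    exit 2
}
if [ $# -lt 1 ]; then usage; fi
episode=$1
baseurl='https://www.cbsnews.com'
# Name the output file after the fifth "/"-separated field of the URL.
output=$(echo "$episode" | awk -F/ '{print $5".mp4"}')
# Pull the JSON assigned to CBSNEWS.defaultPayload out of the page source.
json=$(curl -s "$episode" | grep CBSNEWS.defaultPayload | head -1 | awk -F' = ' '{print $2}')
# Take the first item's video path and swap the phone playlist for the tablet one.
video=$(echo "$json" | jq '.items|.[0].video' | sed 's/_phone.m3u8/_tablet.m3u8/g' | sed 's/"//g')
videourl=$baseurl$video
# "${2:-}" avoids a test error when no second argument is given.
if [ "${2:-}" = "-d" ]
then
    echo "Video-URL: $videourl"
else
    echo "Attempting to download $videourl to $output ..."
    youtube-dl -o "$output" "$videourl"
fi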
The substitution of _phone.m3u8 with _tablet.m3u8 was a "wild" guess, and it pulls the high-res version.
Maybe someone with more programming skills can use this as a base to submit a patch to fix this issue directly in the yt-dl cbsnews extractor?
Cheers.
Six months later, this is still broken as of version 2018.07.21.
I can confirm @wolferikg's clever hack works well. Good job, you just helped me with something. Much appreciated.
Great workaround. My previous workaround of opening the page source and looking for the 740.mp4 link doesn't work anymore, but this seems to.
dead again
It looks like the script is now adding cbsnews.com in front of a URL that already contains it, so taking the URL from the error message and passing it to youtube-dl manually works.
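One way to guard against that double prefix, sketched in Python rather than in the shell script: resolve the value against the base URL instead of concatenating strings. This assumes the `video` field is sometimes a site-relative path and sometimes already a full URL, which is a guess based on the behaviour described above. The paths below are placeholders.

```python
from urllib.parse import urljoin

BASE_URL = 'https://www.cbsnews.com'


def absolute_video_url(video):
    """Resolve a video field against the site base without doubling the host."""
    # urljoin leaves an already-absolute URL untouched and only prepends the
    # base when it is given a relative path.
    return urljoin(BASE_URL, video)


# Placeholder values for illustration only.
print(absolute_video_url('/path/to/clip_tablet.m3u8'))
print(absolute_video_url('https://www.cbsnews.com/path/to/clip_tablet.m3u8'))
```

Both calls print the same URL, so the same code works whichever form the page happens to serve.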
Hi all, I tried to download https://www.cbsnews.com/news/how-the-danske-bank-money-laundering-scheme-involving-230-billion-unraveled-60-minutes-2019-05-19 and failed. I opened DevTools and spotted a sequence of "akamaihd" URLs like https://devicecbsnews-a.akamaihd.net/media/mpx/2019/05/19/1524617283782/0519_60Minutes_Segment1_1853572_1200/0519_60Minutes_Segment1_1853572_1200_14.ts. I can see that 7 or 8 extractors already know about "akamaihd" (francetv, lego, brightcove, senateisvp, livestream, nba, nhk, tvnow), so maybe we can fix cbsnews the same way.
bash-3.2$ youtube-dl --version
2018.01.21
Download of CBS Evening News falls back to generic extractor and ends up grabbing the CBSN live stream instead:
$ youtube-dl https://www.cbsnews.com/video/122-cbs-evening-news/ -v
[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: [u'https://www.cbsnews.com/video/122-cbs-evening-news/', u'-v']
[debug] Encodings: locale UTF-8, fs UTF-8, out UTF-8, pref UTF-8
[debug] youtube-dl version 2018.01.21
[debug] Python version 2.7.9 (CPython) - Linux-3.16.0-4-amd64-x86_64-with-debian-8.9
[debug] exe versions: ffmpeg 2.6.9, ffprobe 2.6.9, rtmpdump 2.4
[debug] Proxy map: {}
[generic] 122-cbs-evening-news: Requesting header
WARNING: Falling back on generic information extractor.
[generic] 122-cbs-evening-news: Downloading webpage
[generic] 122-cbs-evening-news: Extracting information
[generic] 122-cbs-evening-news: Downloading m3u8 information
[download] Downloading playlist: 1/22: CBS Evening News
[generic] playlist 1/22: CBS Evening News: Collected 1 video ids (downloading 1 of them)
[download] Downloading video 1 of 1
[debug] Default format spec: bestvideo+bestaudio/best
[debug] Invoking downloader on u'https://cbsnhls-i.akamaihd.net/hls/live/264710-b/cbsn_hlsprod_2/master_360.m3u8'
[download] Destination: 1_22 - CBS Evening News-122-cbs-evening-news.mp4
[debug] ffmpeg command line: ffmpeg -y -loglevel verbose -headers 'Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7 Accept-Language: en-us,en;q=0.5 Accept-Encoding: gzip, deflate Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:10.0) Gecko/20150101 Firefox/47.0 (Chrome) ' -i 'https://cbsnhls-i.akamaihd.net/hls/live/264710-b/cbsn_hlsprod_2/master_360.m3u8' -c copy -f mp4 '-bsf:a' aac_adtstoasc 'file:1_22 - CBS Evening News-122-cbs-evening-news.mp4.part'
ffmpeg version 2.6.9 Copyright (c) 2000-2016 the FFmpeg developers
built with gcc 4.9.2 (Debian 4.9.2-10)
configuration: --prefix=/usr --extra-cflags='-g -O2 -fstack-protector-strong -Wformat -Werror=format-security ' --extra-ldflags='-Wl,-z,relro' --cc='ccache cc' --enable-shared --enable-libmp3lame --enable-gpl --enable-nonfree --enable-libvorbis --enable-pthreads --enable-libfaac --enable-libxvid --enable-postproc --enable-x11grab --enable-libgsm --enable-libtheora --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libx264 --enable-libspeex --enable-nonfree --disable-stripping --enable-libvpx --enable-libschroedinger --disable-encoder=libschroedinger --enable-version3 --enable-libopenjpeg --enable-librtmp --enable-avfilter --enable-libfreetype --enable-libvo-aacenc --disable-decoder=amrnb --enable-libvo-amrwbenc --enable-libaacplus --libdir=/usr/lib/x86_64-linux-gnu --disable-vda --enable-libbluray --enable-libcdio --enable-gnutls --enable-frei0r --enable-openssl --enable-libass --enable-libopus --enable-fontconfig --enable-libpulse --disable-mips32r2 --disable-mipsdspr1 --disable-mipsdspr2 --enable-libvidstab --enable-libzvbi --enable-avresample --disable-htmlpages --disable-podpages --enable-libutvideo --enable-libfdk-aac --enable-libx265 --enable-libiec61883 --enable-vaapi --enable-libdc1394 --disable-altivec --shlibdir=/usr/lib/x86_64-linux-gnu
libavutil 54. 20.100 / 54. 20.100
libavcodec 56. 26.100 / 56. 26.100
libavformat 56. 25.101 / 56. 25.101
libavdevice 56. 4.100 / 56. 4.100
libavfilter 5. 11.102 / 5. 11.102
libavresample 2. 1. 0 / 2. 1. 0
libswscale 3. 1.101 / 3. 1.101
libswresample 1. 1.100 / 1. 1.100
libpostproc 53. 3.100 / 53. 3.100
[hls,applehttp @ 0x1f277e0] HLS request for url 'https://cbsnhls-i.akamaihd.net/hls/live/264710-b/cbsn_hlsprod_2/20180123T032018/master_360/00004/master_360_01525.ts', offset 0, playlist 0
[mpegts @ 0x1f2e160] parser not found for codec none, packets or times may be invalid.
[mpegts @ 0x1f2e160] parser not found for codec timed_id3, packets or times may be invalid.
[h264 @ 0x23d28c0] Current profile doesn't provide more RBSP data in PPS, skipping
Last message repeated 2 times
[mpegts @ 0x1f2e160] max_analyze_duration 5000000 reached at 5005000 microseconds
[mpegts @ 0x1f2e160] Could not find codec parameters for stream 2 (Unknown: none ([134][0][0][0] / 0x0086)): unknown codec
Consider increasing the value for the 'analyzeduration' and 'probesize' options
[hls,applehttp @ 0x1f277e0] max_analyze_duration 5000000 reached at 5005000 microseconds
[hls,applehttp @ 0x1f277e0] Could not find codec parameters for stream 2 (Unknown: none ([134][0][0][0] / 0x0086)): unknown codec
Consider increasing the value for the 'analyzeduration' and 'probesize' options
Input #0, hls,applehttp, from 'https://cbsnhls-i.akamaihd.net/hls/live/264710-b/cbsn_hlsprod_2/master_360.m3u8':
  Duration: N/A, start: 56320.312000, bitrate: N/A
  Program 0
    Metadata:
      variant_bitrate : 0
    Stream #0:0: Video: h264 (Constrained Baseline) ([27][0][0][0] / 0x001B), yuv420p, 640x360 (640x368) [SAR 1:1 DAR 16:9], 29.97 fps, 29.97 tbr, 90k tbn, 59.94 tbc
    Stream #0:1: Audio: aac (LC) ([15][0][0][0] / 0x000F), 32000 Hz, stereo, fltp, 57 kb/s
    Stream #0:2: Unknown: none ([134][0][0][0] / 0x0086)
    Stream #0:3: Data: timed_id3 (ID3 / 0x20334449)
Output #0, mp4, to 'file:1_22 - CBS Evening News-122-cbs-evening-news.mp4.part':
  Metadata:
    encoder : Lavf56.25.101
    Stream #0:0: Video: h264 ([33][0][0][0] / 0x0021), yuv420p, 640x360 (0x0) [SAR 1:1 DAR 16:9], q=2-31, 29.97 fps, 29.97 tbr, 90k tbn, 90k tbc
    Stream #0:1: Audio: aac ([64][0][0][0] / 0x0040), 32000 Hz, stereo, 57 kb/s
Stream mapping:
  Stream #0:0 -> #0:0 (copy)
  Stream #0:1 -> #0:1 (copy)
Press [q] to stop, [?] for help
[hls,applehttp @ 0x1f277e0] HLS request for url 'https://cbsnhls-i.akamaihd.net/hls/live/264710-b/cbsn_hlsprod_2/20180123T032018/master_360/00004/master_360_01526.ts', offset 0, playlist 0
[NULL @ 0x23d28c0] Current profile doesn't provide more RBSP data in PPS, skipping
Last message repeated 2 times
[hls,applehttp @ 0x1f277e0] HLS request for url 'https://cbsnhls-i.akamaihd.net/hls/live/264710-b/cbsn_hlsprod_2/20180123T032018/master_360/00004/master_360_01527.ts', offset 0, playlist 0
[NULL @ 0x23d28c0] Current profile doesn't provide more RBSP data in PPS, skipping
^C Last message repeated 2 times
[hls,applehttp @ 0x1f277e0] HLS request for url 'https://cbsnhls-i.akamaihd.net/hls/live/264710-b/cbsn_hlsprod_2/20180123T032018/master_360/00004/master_360_01528.ts', offset 0, playlist 0
^C
ERROR: Interrupted by user
Description of your issue, suggested solution and other information
Download of CBS Evening News falls back to generic extractor and ends up grabbing the CBSN live stream instead. Tested from multiple boxes.