Open lesshaste opened 2 years ago
try https://www.chabad.org/5190041 (please leave this issue open, even if it works)
That shows much the same problem
(myenv) user@user-2020:~/Media$ youtube-dl --verbose https://www.chabad.org/5190041
[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: ['--verbose', 'https://www.chabad.org/5190041']
[debug] Encodings: locale UTF-8, fs utf-8, out utf-8, pref UTF-8
[debug] youtube-dl version 2021.12.17
[debug] Python version 3.8.10 (CPython) - Linux-5.13.0-51-generic-x86_64-with-glibc2.29
[debug] exe versions: ffmpeg 4.2.7, ffprobe 4.2.7
[debug] Proxy map: {}
[generic] 5190041: Requesting header
WARNING: Could not send HEAD request to https://www.chabad.org/5190041: HTTP Error 503: Service Temporarily Unavailable
[generic] 5190041: Downloading webpage
ERROR: Unable to download webpage: HTTP Error 503: Service Temporarily Unavailable (caused by <HTTPError 503: 'Service Temporarily Unavailable'>); please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; see https://yt-dl.org/update on how to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.
File "/home/user/python/myenv/lib/python3.8/site-packages/youtube_dl/extractor/common.py", line 634, in _request_webpage
return self._downloader.urlopen(url_or_request)
File "/home/user/python/myenv/lib/python3.8/site-packages/youtube_dl/YoutubeDL.py", line 2288, in urlopen
return self._opener.open(req, timeout=self._socket_timeout)
File "/usr/lib/python3.8/urllib/request.py", line 531, in open
response = meth(req, response)
File "/usr/lib/python3.8/urllib/request.py", line 640, in http_response
response = self.parent.error(
File "/usr/lib/python3.8/urllib/request.py", line 569, in error
return self._call_chain(*args)
File "/usr/lib/python3.8/urllib/request.py", line 502, in _call_chain
result = func(*args)
File "/usr/lib/python3.8/urllib/request.py", line 649, in http_error_default
raise HTTPError(req.full_url, code, msg, hdrs, fp)
The single video on the example page, which might appear above the heading "Origin and History of Kosher", appears to be the same one linked under "You may also be interested in...": https://www.chabad.org/multimedia/video_cdo/aid/5190041/jewish/What-Is-Kosher.htm. The page and the linked video page contain this player code:
<script type="text/javascript">
$j(function() {
var player = new Co.MediaPlayer(Co.MediaPlayer.Instances.Length, "");
player.IsLocalEmbed = true;
player.CdnDomain = 'https://w2.chabad.org';
player.Domain = Co.Request.ServerName;
player.ArticleId = '5190041';
player.AvailableMediaTypes = {};
player.Width = Co.BrowserInfo.IsMobileDevice() ? "auto" : 'auto';
player.Height = Co.BrowserInfo.IsMobileDevice() ? "auto" : 'auto';
player.HideBanner = false;
player.AutoStart = false;
player.StartTime = 0 || player.StartTime;
player.AllowFullScreen = true;
player.DisableAutoplayFeature = false;
player.AvailableMediaTypes['html5'] = new Co.MediaPlayer.MediaInfo('html5', '11633639', Co.BrowserInfo.IsMobileDevice() ? "auto" : 'auto', Co.BrowserInfo.IsMobileDevice() ? "auto" : 'auto', [0, 0, 0]);
player.MediaInfo = Co.MediaInfo["item5190041"];
player.Load("PlayerArea-5190041");
});
</script>
This isn't understood by any of yt-dl's extractors as far as I know.
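For anyone considering a PR, the first step would be pulling the IDs out of that player code. The following is only a minimal sketch, assuming the script block keeps the shape quoted above; `parse_player_code` and both regexes are hypothetical names and not part of yt-dl.

```python
import re

# Hypothetical helper, not part of youtube-dl: pulls the IDs out of the
# Co.MediaPlayer setup code quoted above. Assumes the markup keeps this shape.
ARTICLE_RE = re.compile(r"player\.ArticleId\s*=\s*'(?P<id>\d+)'")
MEDIA_RE = re.compile(
    r"AvailableMediaTypes\['(?P<type>\w+)'\]\s*=\s*"
    r"new\s+Co\.MediaPlayer\.MediaInfo\(\s*'(?P=type)'\s*,\s*'(?P<media_id>\d+)'"
)


def parse_player_code(html):
    """Return (article_id, {media_type: media_id}) found in the page source."""
    article = ARTICLE_RE.search(html)
    media = {m.group('type'): m.group('media_id') for m in MEDIA_RE.finditer(html)}
    return (article.group('id') if article else None), media


sample = """
player.ArticleId = '5190041';
player.AvailableMediaTypes['html5'] = new Co.MediaPlayer.MediaInfo('html5', '11633639', "auto", "auto", [0, 0, 0]);
"""
print(parse_player_code(sample))  # ('5190041', {'html5': '11633639'})
```

How the media ID maps to an actual stream URL is a separate question that this sketch does not answer.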
The linked video page also has an actual SWF video link that loads a "JewishTV" video player and might with the aid of a time machine have played the same video that the JS player above would get.
Additionally, the site has a CloudFlare block that causes the 503 error. wget --user-agent='Mozilla/5.0' ... breaks the block but the equivalent option for yt-dl fails, with Py 2.7 and 3.9.
Dietary superstition fans will have to look elsewhere, though a PR is welcome as usual.
fwiw yt-dlp seems to bypass the 503 error
yt-dlp https://www.chabad.org/multimedia/video_cdo/aid/5190041/jewish/What-Is-Kosher.htm --verbose
[debug] Command-line config: ['https://www.chabad.org/multimedia/video_cdo/aid/5190041/jewish/What-Is-Kosher.htm', '--verbose']
[debug] User config "C:\Users\jaybu\AppData\Roaming\yt-dlp\config.txt": ['--ffmpeg-location', 'C:\\Users\\jaybu\\ffmpeg\\bin', '-P', 'C:\\Users\\jaybu\\youtube.dl', '--update', '--audio-quality', '0', '--write-subs', '--write-auto-subs', '--embed-subs', '--compat-options', 'no-keep-subs,no-live-chat']
[debug] Encodings: locale cp1252, fs utf-8, pref cp1252, out utf-8, error utf-8, screen utf-8
[debug] yt-dlp version 2022.06.22.1 [a86e01e] (win32_exe)
[debug] Compatibility options: no-live-chat, no-keep-subs
[debug] Python version 3.8.10 (CPython 64bit) - Windows-10-10.0.19043-SP0
[debug] Checking exe version: "C:\Users\jaybu\ffmpeg\bin\ffmpeg" -bsfs
[debug] Checking exe version: "C:\Users\jaybu\ffmpeg\bin\ffprobe" -bsfs
[debug] exe versions: ffmpeg N-106498-g854615adf2-20220405 (setts), ffprobe N-106498-g854615adf2-20220405
[debug] Optional libraries: Cryptodome-3.14.1, brotli-1.0.9, certifi-2022.06.15, mutagen-1.45.1, sqlite3-2.6.0, websockets-10.3
[debug] Proxy map: {}
[debug] Fetching release info: https://api.github.com/repos/yt-dlp/yt-dlp/releases/latest
Latest version: 2022.06.22.1, Current version: 2022.06.22.1
yt-dlp is up to date (2022.06.22.1)
[debug] [generic] Extracting URL: https://www.chabad.org/multimedia/video_cdo/aid/5190041/jewish/What-Is-Kosher.htm
[generic] What-Is-Kosher: Requesting header
WARNING: [generic] Falling back on generic information extractor.
[generic] What-Is-Kosher: Downloading webpage
[generic] What-Is-Kosher: Extracting information
[debug] Looking for video embeds
ERROR: Unsupported URL: https://www.chabad.org/multimedia/video_cdo/aid/5190041/jewish/What-Is-Kosher.htm
Traceback (most recent call last):
File "yt_dlp\YoutubeDL.py", line 1427, in wrapper
File "yt_dlp\YoutubeDL.py", line 1497, in __extract_info
File "yt_dlp\extractor\common.py", line 647, in extract
File "yt_dlp\extractor\generic.py", line 4136, in _real_extract
yt_dlp.utils.UnsupportedError: Unsupported URL: https://www.chabad.org/multimedia/video_cdo/aid/5190041/jewish/What-Is-Kosher.htm
As a (very poor) workaround, this works:
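# Fetch segments 0..38; the URL is quoted so the shell does not treat '?' as a glob
for c in $(seq 0 38); do wget -O "media_b600000_$c.ts" "https://hls-vod-cdn.chabad.org/vod/_definst_/smil:smil_cache1/116/11633637.smil/media_b600000_$c.ts?v=21112226"; done
# Build the concat list in numeric segment order (overwritten on each run, not appended)
for f in media_b600000_*.ts; do echo "file '$f'"; done | sort -V > mylist.txt
# Join the segments without re-encoding
ffmpeg -f concat -safe 0 -i mylist.txt -c copy test.mp4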
In https://hls-vod-cdn.chabad.org/vod/_definst_/smil:smil_cache1/116/11633637.smil/media_b600000_$c.ts?v=21112226, if 11633637 is the second param in new Co.MediaPlayer.MediaInfo(...) and 116 is its first 3 characters, what is 21112226?
Possibly you might be able to pipe the for command with wget -q -O - ... into ffmpeg -f ts -i - ..., so avoiding the intermediate files?
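The same idea can be sketched in Python: put the segments in numeric order and write them to a single binary stream. This assumes, without having tested it against this site, that the MPEG-TS segments can be joined by plain byte concatenation; `segment_index` and `concat_segments` are hypothetical names, and the demo uses fake payloads rather than real downloads.

```python
import io
import re


def segment_index(name):
    """Numeric index from a name like media_b600000_12.ts (or with ?v=...)."""
    m = re.search(r'_(\d+)\.ts', name)
    return int(m.group(1)) if m else -1


def concat_segments(segments, out):
    """Write segment bodies to `out` in index order.

    `segments` maps file name -> bytes; `out` is any binary writable stream,
    for example a file or the stdin pipe of an ffmpeg process.
    """
    for name in sorted(segments, key=segment_index):
        out.write(segments[name])


# Tiny demo with fake payloads instead of real downloads.
fake = {'media_b600000_10.ts': b'C', 'media_b600000_2.ts': b'B', 'media_b600000_0.ts': b'A'}
buf = io.BytesIO()
concat_segments(fake, buf)
print(buf.getvalue())  # b'ABC'
```

Sorting on the parsed index plays the role of sort -V in the shell workaround, so segment 10 does not land before segment 2.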
At a guess, v=21112226 refers to video 21112226? That is, v means video.
Possibly, but this value doesn't seem to be set anywhere in the page JS. And 11633637 appears to be the video ID. Ofc, it's possible that there could be different IDs, say one for the content owner and one for the video host.
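To make the guessed URL layout concrete, here is a small sketch that rebuilds the segment URL from the workaround. Everything in it is an assumption drawn from this thread: that the directory is the first three characters of the media ID, that the number after `media_b` is a bitrate, and that `v` can simply be passed through. `segment_url` is a hypothetical helper.

```python
# Sketch of the URL layout guessed at in this thread; nothing here is
# confirmed by the site. `segment_url` and its parameters are hypothetical.
BASE = 'https://hls-vod-cdn.chabad.org/vod/_definst_/smil:smil_cache1'


def segment_url(media_id, index, bitrate=600000, v=None):
    """Build one segment URL; the directory is assumed to be media_id[:3]."""
    url = '{0}/{1}/{2}.smil/media_b{3}_{4}.ts'.format(
        BASE, media_id[:3], media_id, bitrate, index)
    # The meaning of `v` is unknown, so it is only appended when supplied.
    return url + ('?v={0}'.format(v) if v is not None else '')


print(segment_url('11633637', 0, v='21112226'))
```

With the values from the workaround this reproduces the first URL that the wget loop requests; whether the pattern holds for other videos is untested.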