ytdl-org / youtube-dl

Command-line program to download videos from YouTube.com and other video sites
http://ytdl-org.github.io/youtube-dl/
The Unlicense

Unsupported URL on theplatform.eu #24795

Open runch-randa opened 4 years ago

runch-randa commented 4 years ago

Verbose log

[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: [u'-v', u'https://player.theplatform.eu/p/lCpzgC/dt-vp/embed/select/media/bcl_4591?form=html']
[debug] Encodings: locale UTF-8, fs UTF-8, out UTF-8, pref UTF-8
[debug] youtube-dl version 2020.03.24
[debug] Python version 2.7.13 (CPython) - Linux-4.19.105-feral-x86_64-with-debian-9.11
[debug] exe versions: ffmpeg 3.2.14-1, ffprobe 3.2.14-1
[debug] Proxy map: {}
[generic] bcl_4591?form=html: Requesting header
WARNING: Could not send HEAD request to https://player.theplatform.eu/p/lCpzgC/dt-vp/embed/select/media/bcl_4591?form=html: HTTP Error 405: Method Not Allowed
[generic] bcl_4591?form=html: Downloading webpage
WARNING: Falling back on generic information extractor.
[generic] bcl_4591?form=html: Extracting information
[generic] bcl_4591?autoPlay=false: Requesting header
WARNING: Could not send HEAD request to https://player.theplatform.eu/p/lCpzgC/dt-vp/embed/select/media/bcl_4591?autoPlay=false: HTTP Error 405: Method Not Allowed
[generic] bcl_4591?autoPlay=false: Downloading webpage
WARNING: Falling back on generic information extractor.
[generic] bcl_4591?autoPlay=false: Extracting information
ERROR: Unsupported URL: https://player.theplatform.eu/p/lCpzgC/dt-vp/embed/select/media/bcl_4591?autoPlay=false
Traceback (most recent call last):
  File "/media/28a0/kaloozas/pip/lib/python2.7/site-packages/youtube_dl/extractor/generic.py", line 2375, in _real_extract
    doc = compat_etree_fromstring(webpage.encode('utf-8'))
  File "/media/28a0/kaloozas/pip/lib/python2.7/site-packages/youtube_dl/compat.py", line 2551, in compat_etree_fromstring
    doc = _XML(text, parser=etree.XMLParser(target=_TreeBuilder(element_factory=_element_factory)))
  File "/media/28a0/kaloozas/pip/lib/python2.7/site-packages/youtube_dl/compat.py", line 2540, in _XML
    parser.feed(text)
  File "/usr/lib/python2.7/xml/etree/ElementTree.py", line 1653, in feed
    self._raiseerror(v)
  File "/usr/lib/python2.7/xml/etree/ElementTree.py", line 1517, in _raiseerror
    raise err
ParseError: not well-formed (invalid token): line 11, column 97
Traceback (most recent call last):
  File "/media/28a0/kaloozas/pip/lib/python2.7/site-packages/youtube_dl/YoutubeDL.py", line 797, in extract_info
    ie_result = ie.extract(url)
  File "/media/28a0/kaloozas/pip/lib/python2.7/site-packages/youtube_dl/extractor/common.py", line 530, in extract
    ie_result = self._real_extract(url)
  File "/media/28a0/kaloozas/pip/lib/python2.7/site-packages/youtube_dl/extractor/generic.py", line 3352, in _real_extract
    raise UnsupportedError(url)
UnsupportedError: Unsupported URL: https://player.theplatform.eu/p/lCpzgC/dt-vp/embed/select/media/bcl_4591?autoPlay=false

Description

I'm having trouble downloading from theplatform.eu, one of the websites youtube-dl supports. Another user reported the same issue almost half a year ago but got no response. The video plays fine in the browser, but I get the error above when I try to download it with youtube-dl.

Example link: https://player.theplatform.eu/p/lCpzgC/dt-vp/embed/select/media/bcl_4591?form=html

willbeaufoy commented 4 years ago

The immediate cause is that youtube-dl is set up to use theplatform.com, not theplatform.eu; for example, we get past the initial error if we change .com to .eu here. However, we then run into another error. I played around with it a bit, but each time one error is fixed I run into another.

I tried downloading the link you provided with .eu replaced by .com in the URL, but since I'm in Europe it wouldn't let me. I also tried with my VPN set to the US, but that didn't work either ('Temporary failure in name resolution', which I don't think has anything to do with youtube-dl). So if anyone in the US, or with a working VPN, can try this command, I'd be interested to know whether it works:

youtube-dl -v --simulate 'https://player.theplatform.com/p/lCpzgC/dt-vp/embed/select/media/bcl_4591?autoPlay=false'

OrigamiEngineer commented 4 years ago

@willbeaufoy I don't believe theplatform.com and theplatform.eu have hosting parity (or this specific video is only hosted on the .eu site). I am in the US, but testing like you suggested did not work for me.

I was able to download the link @runch-randa was using, but only up to format http-1901; everything else errored out, and I'm not sure why. I was able to download the highest quality by opening the link in my browser, retrieving the link to the video's master m3u8 playlist from the network monitor, and giving youtube-dl that link. This requires further testing, although my current thought is that streams above a certain bit rate can't be downloaded directly and must be streamed. Possibly some low-level DRM? I will look into whether I can fall back to downloading from the m3u8.
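For anyone trying the same workaround, here is a rough sketch of how the variants listed in a master m3u8 playlist could be inspected to find the highest-bandwidth stream. It is only an illustration of the playlist format, not what youtube-dl does internally. The playlist text and its bandwidth values are made up (the real playlist for this video was not posted in this thread), and `list_variants` and `best_variant` are hypothetical helper names.

```python
import re

# Made-up master playlist for illustration only; the real playlist for this
# video was not posted in the thread.
SAMPLE_MASTER = """#EXTM3U
#EXT-X-STREAM-INF:BANDWIDTH=800000,RESOLUTION=640x360
low/index.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=1800000,RESOLUTION=1280x720
mid/index.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=4500000,RESOLUTION=1920x1080
high/index.m3u8
"""


def list_variants(master_text):
    """Return (bandwidth, uri) pairs from an HLS master playlist."""
    variants = []
    pending_bandwidth = None
    for line in master_text.splitlines():
        line = line.strip()
        if line.startswith('#EXT-X-STREAM-INF:'):
            # Require ':' or ',' before the name so AVERAGE-BANDWIDTH is skipped.
            m = re.search(r'[:,]BANDWIDTH=(\d+)', line)
            pending_bandwidth = int(m.group(1)) if m else 0
        elif line and not line.startswith('#') and pending_bandwidth is not None:
            # The first non-comment line after the tag is that variant's URI.
            variants.append((pending_bandwidth, line))
            pending_bandwidth = None
    return variants


def best_variant(master_text):
    """Return the (bandwidth, uri) pair with the highest bandwidth, or None."""
    variants = list_variants(master_text)
    return max(variants, key=lambda v: v[0]) if variants else None
```

The URIs in a master playlist are usually relative, so the chosen one would still need to be resolved against the playlist's own URL before being passed to a downloader.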

I changed all of the URL regexes to match both .com and .eu. It seems to work, although this is my first time contributing to the project, and I'm not sure this is the best way to do it, even though my changes pass flake8.
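To illustrate the kind of change described, here is a minimal sketch of a pattern that accepts both domains. It is not the actual `_VALID_URL` from the theplatform extractor; the pattern, the group names, and the `parse_player_url` helper are all hypothetical and only cover player URLs shaped like the one in this issue.

```python
import re

# Hypothetical pattern for illustration; not the real _VALID_URL used by
# youtube-dl's theplatform extractor.
PLAYER_URL_RE = re.compile(
    r'https?://player\.theplatform\.(?:com|eu)/p/'  # accept either TLD
    r'(?P<provider_id>[^/]+)/'                      # e.g. lCpzgC
    r'(?:[^/?#]+/)*'                                # intermediate path segments
    r'(?P<media_id>[^/?#]+)'                        # last segment, e.g. bcl_4591
)


def parse_player_url(url):
    """Return (provider_id, media_id), or None if the URL does not match."""
    m = PLAYER_URL_RE.match(url)
    if m is None:
        return None
    return m.group('provider_id'), m.group('media_id')
```

With this pattern the .eu link from the original report and its .com counterpart both yield the same provider and media IDs, while other domains are rejected.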

A number of other extractors depend on classes in theplatform extractor, and I'm not comfortable making a pull request until I've properly tested that my changes don't break those extractors. That being said, I have forked the repo and made a branch for theplatform. Here is a direct link to the modified 'theplatform' extractor. Please let me know if it works for you.

willbeaufoy commented 4 years ago

Hey Brendan, I cloned your repo and checked out the theplatform branch. When I run the following command:

python3 -m youtube_dl -v 'https://player.theplatform.eu/p/lCpzgC/dt-vp/embed/select/media/bcl_4591?autoPlay=false'

It starts OK, but then I get the error 'Did not get any data blocks'.

Looks like you've solved the URL issue at least though.