ytdl-org / youtube-dl

Command-line program to download videos from YouTube.com and other video sites
http://ytdl-org.github.io/youtube-dl/
The Unlicense

Improve omroep.nl support #12924

Open jpluimers opened 7 years ago

jpluimers commented 7 years ago


I'm running 2017.04.28 from Mac OS X homebrew.


Full verbose output (youtube-dl run with the -v flag) for three example pages:

$ youtube-dl -v https://www.omroepmax.nl/bloemencorso/uitzending/tv/bloemencorso-bollenstreek-zaterdag-29-april-2017/
[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: [u'-v', u'https://www.omroepmax.nl/bloemencorso/uitzending/tv/bloemencorso-bollenstreek-zaterdag-29-april-2017/']
[debug] Encodings: locale UTF-8, fs utf-8, out UTF-8, pref UTF-8
[debug] youtube-dl version 2017.04.28
[debug] Python version 2.7.10 - Darwin-16.5.0-x86_64-i386-64bit
[debug] exe versions: ffmpeg 3.3, ffprobe 3.3
[debug] Proxy map: {}
[generic] bloemencorso-bollenstreek-zaterdag-29-april-2017: Requesting header
WARNING: Falling back on generic information extractor.
[generic] bloemencorso-bollenstreek-zaterdag-29-april-2017: Downloading webpage
[generic] bloemencorso-bollenstreek-zaterdag-29-april-2017: Extracting information
ERROR: Unsupported URL: https://www.omroepmax.nl/bloemencorso/uitzending/tv/bloemencorso-bollenstreek-zaterdag-29-april-2017/
Traceback (most recent call last):
  File "/usr/local/bin/youtube-dl/youtube_dl/extractor/generic.py", line 1914, in _real_extract
    doc = compat_etree_fromstring(webpage.encode('utf-8'))
  File "/usr/local/bin/youtube-dl/youtube_dl/compat.py", line 2526, in compat_etree_fromstring
    doc = _XML(text, parser=etree.XMLParser(target=_TreeBuilder(element_factory=_element_factory)))
  File "/usr/local/bin/youtube-dl/youtube_dl/compat.py", line 2515, in _XML
    parser.feed(text)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/xml/etree/ElementTree.py", line 1642, in feed
    self._raiseerror(v)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/xml/etree/ElementTree.py", line 1506, in _raiseerror
    raise err
ParseError: not well-formed (invalid token): line 48, column 47
Traceback (most recent call last):
  File "/usr/local/bin/youtube-dl/youtube_dl/YoutubeDL.py", line 760, in extract_info
    ie_result = ie.extract(url)
  File "/usr/local/bin/youtube-dl/youtube_dl/extractor/common.py", line 429, in extract
    ie_result = self._real_extract(url)
  File "/usr/local/bin/youtube-dl/youtube_dl/extractor/generic.py", line 2781, in _real_extract
    raise UnsupportedError(url)
UnsupportedError: Unsupported URL: https://www.omroepmax.nl/bloemencorso/uitzending/tv/bloemencorso-bollenstreek-zaterdag-29-april-2017/

$ youtube-dl -v https://www.npo.nl/bloemencorso/POMS_S_MAX_083096
[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: [u'-v', u'https://www.npo.nl/bloemencorso/POMS_S_MAX_083096']
[debug] Encodings: locale UTF-8, fs utf-8, out UTF-8, pref UTF-8
[debug] youtube-dl version 2017.04.28
[debug] Python version 2.7.10 - Darwin-16.5.0-x86_64-i386-64bit
[debug] exe versions: ffmpeg 3.3, ffprobe 3.3
[debug] Proxy map: {}
[generic] POMS_S_MAX_083096: Requesting header
WARNING: Falling back on generic information extractor.
[generic] POMS_S_MAX_083096: Downloading webpage
[generic] POMS_S_MAX_083096: Extracting information
ERROR: Unsupported URL: https://www.npo.nl/bloemencorso/POMS_S_MAX_083096
Traceback (most recent call last):
  File "/usr/local/bin/youtube-dl/youtube_dl/extractor/generic.py", line 1914, in _real_extract
    doc = compat_etree_fromstring(webpage.encode('utf-8'))
  File "/usr/local/bin/youtube-dl/youtube_dl/compat.py", line 2526, in compat_etree_fromstring
    doc = _XML(text, parser=etree.XMLParser(target=_TreeBuilder(element_factory=_element_factory)))
  File "/usr/local/bin/youtube-dl/youtube_dl/compat.py", line 2515, in _XML
    parser.feed(text)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/xml/etree/ElementTree.py", line 1642, in feed
    self._raiseerror(v)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/xml/etree/ElementTree.py", line 1506, in _raiseerror
    raise err
ParseError: mismatched tag: line 50, column 2
Traceback (most recent call last):
  File "/usr/local/bin/youtube-dl/youtube_dl/YoutubeDL.py", line 760, in extract_info
    ie_result = ie.extract(url)
  File "/usr/local/bin/youtube-dl/youtube_dl/extractor/common.py", line 429, in extract
    ie_result = self._real_extract(url)
  File "/usr/local/bin/youtube-dl/youtube_dl/extractor/generic.py", line 2781, in _real_extract
    raise UnsupportedError(url)
UnsupportedError: Unsupported URL: https://www.npo.nl/bloemencorso/POMS_S_MAX_083096

$ youtube-dl -v http://www.uitzendinggemist.net/aflevering/55627/Bloemencorso.html
[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: [u'-v', u'http://www.uitzendinggemist.net/aflevering/55627/Bloemencorso.html']
[debug] Encodings: locale UTF-8, fs utf-8, out UTF-8, pref UTF-8
[debug] youtube-dl version 2017.04.28
[debug] Python version 2.7.10 - Darwin-16.5.0-x86_64-i386-64bit
[debug] exe versions: ffmpeg 3.3, ffprobe 3.3
[debug] Proxy map: {}
[generic] Bloemencorso: Requesting header
WARNING: Falling back on generic information extractor.
[generic] Bloemencorso: Downloading webpage
[generic] Bloemencorso: Extracting information
ERROR: Unsupported URL: http://www.uitzendinggemist.net/aflevering/55627/Bloemencorso.html
Traceback (most recent call last):
  File "/usr/local/bin/youtube-dl/youtube_dl/extractor/generic.py", line 1914, in _real_extract
    doc = compat_etree_fromstring(webpage.encode('utf-8'))
  File "/usr/local/bin/youtube-dl/youtube_dl/compat.py", line 2526, in compat_etree_fromstring
    doc = _XML(text, parser=etree.XMLParser(target=_TreeBuilder(element_factory=_element_factory)))
  File "/usr/local/bin/youtube-dl/youtube_dl/compat.py", line 2515, in _XML
    parser.feed(text)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/xml/etree/ElementTree.py", line 1642, in feed
    self._raiseerror(v)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/xml/etree/ElementTree.py", line 1506, in _raiseerror
    raise err
ParseError: mismatched tag: line 1, column 1485
Traceback (most recent call last):
  File "/usr/local/bin/youtube-dl/youtube_dl/YoutubeDL.py", line 760, in extract_info
    ie_result = ie.extract(url)
  File "/usr/local/bin/youtube-dl/youtube_dl/extractor/common.py", line 429, in extract
    ie_result = self._real_extract(url)
  File "/usr/local/bin/youtube-dl/youtube_dl/extractor/generic.py", line 2781, in _real_extract
    raise UnsupportedError(url)
UnsupportedError: Unsupported URL: http://www.uitzendinggemist.net/aflevering/55627/Bloemencorso.html

Description of your issue, suggested solution and other information


It would be really nice if youtube-dl could eventually drill down into the URLs of Dutch public broadcasting sites on its own.

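For instance, youtube-dl cannot download these video pages directly (see the verbose output above):

https://www.omroepmax.nl/bloemencorso/uitzending/tv/bloemencorso-bollenstreek-zaterdag-29-april-2017/
https://www.npo.nl/bloemencorso/POMS_S_MAX_083096
http://www.uitzendinggemist.net/aflevering/55627/Bloemencorso.html

However, closer inspection shows that they can all be drilled down to this URL: https://content10c2b.omroep.nl/urishieldv2/l27m3bb0caff1116ae0b005904cc57000000.d66419bf63a23daa6c2fa66679e85561/ceresodi/h264/p/14/10/10/4d/std_POW_03545163.m4v?odiredirecturl=%2Fvideo%2Fida%2Fh264_std%2F2d0411147375c7311c1dffb16716abf8%2F5904cc57%2FPOW_03545163%2F1%3Ftype%3Djsonp%26callback%3D%3F%26callback%3DjsonpCallback1493486678350618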

After drilling down manually, this workaround works (the URL is quoted so the shell does not interpret the ? character):

youtube-dl 'https://content10c2b.omroep.nl/urishieldv2/l27m3bb0caff1116ae0b005904cc57000000.d66419bf63a23daa6c2fa66679e85561/ceresodi/h264/p/14/10/10/4d/std_POW_03545163.m4v?odiredirecturl=%2Fvideo%2Fida%2Fh264_std%2F2d0411147375c7311c1dffb16716abf8%2F5904cc57%2FPOW_03545163%2F1%3Ftype%3Djsonp%26callback%3D%3F%26callback%3DjsonpCallback1493486678350618'
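
In case it helps whoever picks this up, here is a minimal Python 3 sketch that pulls the drilled-down URL apart using only the standard library. The helper name describe_media_url is made up for illustration and is not part of youtube-dl. The two drilled-down URLs in this issue carry different hosts and tokens for the same POW_03545163 file, so my guess (not verified) is that these URLs are session-specific and cannot be reused for long.

```python
from urllib.parse import urlsplit, parse_qs

# The drilled-down URL from this issue, split only for readability.
MEDIA_URL = (
    "https://content10c2b.omroep.nl/urishieldv2/"
    "l27m3bb0caff1116ae0b005904cc57000000.d66419bf63a23daa6c2fa66679e85561"
    "/ceresodi/h264/p/14/10/10/4d/std_POW_03545163.m4v"
    "?odiredirecturl=%2Fvideo%2Fida%2Fh264_std%2F2d0411147375c7311c1dffb16716abf8"
    "%2F5904cc57%2FPOW_03545163%2F1%3Ftype%3Djsonp%26callback%3D%3F"
    "%26callback%3DjsonpCallback1493486678350618"
)


def describe_media_url(url):
    """Hypothetical helper: split a media URL into host, file name and redirect path."""
    parts = urlsplit(url)
    # parse_qs percent-decodes the values; each key maps to a list of values.
    query = parse_qs(parts.query)
    redirect = query.get("odiredirecturl", [None])[0]
    filename = parts.path.rsplit("/", 1)[-1]
    return {"host": parts.netloc, "filename": filename, "redirect": redirect}


info = describe_media_url(MEDIA_URL)
print(info["host"])
print(info["filename"])
print(info["redirect"])
```

The decoded odiredirecturl value is a path that again contains the POW_03545163 identifier, which might be a better thing for an extractor to key on than the full URL.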

The HTML fragment to look for resembles this:

<video class="jw-video jw-reset" x-webkit-airplay="allow" webkit-playsinline="" playsinline="" jw-loaded="data" src="https://content10c2a.omroep.nl/urishieldv2/l27m32184b3a0e52f134005904c88a000000.f5664b74ce20c14c10f24fb4a1a5464d/ceresodi/h264/p/14/10/10/4d/std_POW_03545163.m4v?odiredirecturl=%252Fvideo%252Fida%252Fh264_std%252F663d68edac522767b3f5d7c3e55f494a%252F5904c88a%252FPOW_03545163%252F1%253Ftype%253Djsonp%2526callback%253D%253F%2526callback%253DjsonpCallback1493485705164888" jw-played="" style="object-fit: fill;"></video>
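
As a rough illustration of that hunt, the sketch below uses Python 3's standard html.parser to collect the src attribute of any video element whose class list contains jw-video. The class and function names (VideoSrcParser, find_video_sources) and the sample markup with its example.invalid URL are placeholders I made up, not youtube-dl code. One big assumption: the video element may well be injected by JavaScript in the browser, in which case the raw page that youtube-dl downloads would not contain it and this approach alone would not be enough.

```python
from html.parser import HTMLParser


class VideoSrcParser(HTMLParser):
    """Collects src attributes of <video> tags that carry the jw-video class."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag != "video":
            return
        attr_map = dict(attrs)
        # Attributes without a value are reported as None, hence the "or".
        classes = (attr_map.get("class") or "").split()
        if "jw-video" in classes and attr_map.get("src"):
            # HTMLParser has already unescaped entities such as &amp; here.
            self.sources.append(attr_map["src"])


def find_video_sources(html_text):
    """Hypothetical helper: return all jw-video source URLs found in html_text."""
    parser = VideoSrcParser()
    parser.feed(html_text)
    parser.close()
    return parser.sources


# Placeholder markup shaped like the fragment above (not a real media URL).
SAMPLE_HTML = (
    '<div><video class="jw-video jw-reset" playsinline="" '
    'src="https://example.invalid/std_EXAMPLE.m4v?a=1&amp;b=2"></video></div>'
)

print(find_video_sources(SAMPLE_HTML))
```

The resulting URL could then be passed to youtube-dl as in the workaround above.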
GijsTimmers commented 6 years ago

Another example would be https://binnenstebuiten.kro-ncrv.nl/fragmenten/huis-met-open-constructie

GijsTimmers commented 6 years ago

@dstftw, as the maintainer of npo.py, would you consider adding support for these kinds of URLs?