ytdl-org / youtube-dl

Command-line program to download videos from YouTube.com and other video sites
http://ytdl-org.github.io/youtube-dl/
The Unlicense

Authenticated GDC Vault 2019 content throws exception #20638

Open bbi-jimspoto opened 5 years ago

bbi-jimspoto commented 5 years ago


ERROR: Authentication Failure / Redirect

"Unsupported URL: https://www.gdcvault.com/login"

I'm able to access the requisite content in a web browser, but youtube-dl appears to fail when authenticating due to a login redirect (?)

As there appear to be known issues with the GDCVault extractor (2019), I'm using the generic extractor as advised in other reports.

Log Output: Failure

[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: [u'--force-generic-extractor', u'--username', u'PRIVATE', u'--password', u'PRIVATE', u'--verbose', u'https://www.gdcvault.com/play/1025986/Creating-a-Deeper-Emotional-Connection']
[debug] Encodings: locale cp1252, fs mbcs, out cp850, pref cp1252
[debug] youtube-dl version 2019.04.07
[debug] Python version 2.7.15 (CPython) - Windows-10-10.0.17763
[debug] exe versions: none
[debug] Proxy map: {}
[generic] Creating-a-Deeper-Emotional-Connection: Requesting header
[redirect] Following redirect to https://www.gdcvault.com/login
[generic] login: Requesting header
WARNING: Forcing on generic information extractor.
[generic] login: Downloading webpage
[generic] login: Extracting information
ERROR: Unsupported URL: https://www.gdcvault.com/login
Traceback (most recent call last):
  File "c:\python27\lib\site-packages\youtube_dl\extractor\generic.py", line 2337, in _real_extract
    doc = compat_etree_fromstring(webpage.encode('utf-8'))
  File "c:\python27\lib\site-packages\youtube_dl\compat.py", line 2551, in compat_etree_fromstring
    doc = _XML(text, parser=etree.XMLParser(target=_TreeBuilder(element_factory=_element_factory)))
  File "c:\python27\lib\site-packages\youtube_dl\compat.py", line 2540, in _XML
    parser.feed(text)
  File "c:\python27\lib\xml\etree\ElementTree.py", line 1659, in feed
    self._raiseerror(v)
  File "c:\python27\lib\xml\etree\ElementTree.py", line 1523, in _raiseerror
    raise err
ParseError: not well-formed (invalid token): line 51, column 22
Traceback (most recent call last):
  File "c:\python27\lib\site-packages\youtube_dl\YoutubeDL.py", line 796, in extract_info
    ie_result = ie.extract(url)
  File "c:\python27\lib\site-packages\youtube_dl\extractor\common.py", line 529, in extract
    ie_result = self._real_extract(url)
  File "c:\python27\lib\site-packages\youtube_dl\extractor\generic.py", line 3320, in _real_extract
    raise UnsupportedError(url)
UnsupportedError: Unsupported URL: https://www.gdcvault.com/login

As a control case, here is an example of output that works; the only difference in the passed arguments is the target URL. Unlike the failure above, the content at the URL below does not require authentication, so youtube-dl succeeds.

Example Log Output: Success

[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: [u'--force-generic-extractor', u'--username', u'PRIVATE', u'--password', u'PRIVATE', u'--verbose', u'https://www.gdcvault.com/play/1026496/-Marvel-s-Spider-Man']
[debug] Encodings: locale cp1252, fs mbcs, out cp850, pref cp1252
[debug] youtube-dl version 2019.04.07
[debug] Python version 2.7.15 (CPython) - Windows-10-10.0.17763
[debug] exe versions: none
[debug] Proxy map: {}
[generic] -Marvel-s-Spider-Man: Requesting header
WARNING: Forcing on generic information extractor.
[generic] -Marvel-s-Spider-Man: Downloading webpage
[generic] -Marvel-s-Spider-Man: Extracting information
[Kaltura] 1_bln945rf: Downloading video info JSON
[Kaltura] 1_bln945rf: Downloading m3u8 information
[debug] Default format spec: best/bestvideo+bestaudio
[debug] Invoking downloader on u'http://cdnapi.kaltura.com/p/1670711/sp/167071100/playManifest/entryId/1_bln945rf/format/url/protocol/http/flavorId/1_tl9ywps0?referrer=aHR0cHM6Ly93d3cuZ2RjdmF1bHQuY29t'
[download] Destination: Marvel's Spider-Man' - A Technical Postmortem-1_bln945rf.mp4
[download] 100% of 1.61GiB in 00:27

Final Note: Failure Testing

A possible clue: when I provide bogus credentials, I get output identical to the two cases above. Free content works, but secure content fails with the same URL error. Since the credentials make no difference, this may indicate that authentication is not functioning at all.

Note that I do have valid credentials, which work as expected when entering them directly onto the website and viewing content in the embedded player
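
One way to check for this symptom is to compare the URL that was requested with the URL the request finally landed on. The following is a minimal Python 3 sketch, assuming an unauthenticated request ends up on the site's /login path as in the failure log above; the function name `redirected_to_login` and the `login_path` parameter are hypothetical and not part of youtube-dl.

```python
from urllib.parse import urlsplit


def redirected_to_login(requested_url, final_url, login_path="/login"):
    """Return True if a request for a video page ended up on the login page.

    Assumption: the site sends unauthenticated visitors to `login_path`
    on the same host, as the [redirect] line in the failure log suggests.
    """
    requested = urlsplit(requested_url)
    final = urlsplit(final_url)
    same_site = requested.hostname == final.hostname
    landed_on_login = final.path.rstrip("/") == login_path
    # Only count it as a redirect if the path actually changed.
    return same_site and landed_on_login and requested.path != final.path
```

With the two URLs from the logs, the members-only page (redirected to https://www.gdcvault.com/login) would be flagged, while the free page, which is fetched without a redirect, would not.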


remitamine commented 5 years ago

Try using cookies first, both with and without --force-generic-extractor. Note that a fix for free content has been pushed upstream. --username and --password have no effect when used with --force-generic-extractor.
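
A rough Python 3 sketch of that suggestion: it builds the two command lines to try, with and without --force-generic-extractor, passing a cookie file instead of --username and --password. The helper name `build_command` and the file name cookies.txt are assumptions for illustration; only the --cookies and --force-generic-extractor flags come from youtube-dl itself.

```python
def build_command(url, cookies_path, force_generic=False):
    """Build a youtube-dl argument list that authenticates via a cookie file.

    `cookies_path` is assumed to point at a cookies.txt file exported from
    a browser session that is already logged in to the site.
    """
    args = ["youtube-dl", "--cookies", cookies_path]
    if force_generic:
        args.append("--force-generic-extractor")
    args.append(url)
    return args


url = "https://www.gdcvault.com/play/1025986/Creating-a-Deeper-Emotional-Connection"
# The two variants suggested above: site extractor first, then generic.
variants = [build_command(url, "cookies.txt", force_generic=flag)
            for flag in (False, True)]
```

Each list in `variants` could then be passed to `subprocess.run(...)` to run the attempt.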

Lockie85 commented 5 years ago

I have the same issue. @bbi-jimspoto can you let me know when you have a solution and provide an example of the command you ran, just to make my life easier? I would be very thankful.

bbi-jimspoto commented 5 years ago

@remitamine Thank you, that works!

It's not a fix, but it's a valid workaround using the generic extractor.

For anyone else following along at home (@Weggy) here's what I did:

You may need to refresh the contents of cookies.txt if re-authenticating
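
To tell whether cookies.txt needs refreshing, one option is to look for expired entries. The following is a minimal Python 3 sketch, assuming the file is in the tab-separated Netscape cookies.txt format that --cookies expects; the function name `stale_cookies` is hypothetical, and treating an expiry of 0 as a session cookie is an assumption about how the file was exported.

```python
import time


def stale_cookies(cookies_text, domain="gdcvault.com", now=None):
    """Return the names of cookies for `domain` whose expiry time has passed.

    Each cookie line has seven tab-separated fields:
    domain, subdomain flag, path, secure flag, expiry, name, value.
    """
    now = time.time() if now is None else now
    stale = []
    for line in cookies_text.splitlines():
        # Some exporters mark HttpOnly cookies with this prefix.
        if line.startswith("#HttpOnly_"):
            line = line[len("#HttpOnly_"):]
        if not line.strip() or line.startswith("#"):
            continue  # blank line or comment
        fields = line.split("\t")
        if len(fields) != 7:
            continue  # not a cookie line; skip rather than guess
        cookie_domain, _, _, _, expires, name, _ = fields
        host = cookie_domain.lstrip(".")
        if host != domain and not host.endswith("." + domain):
            continue
        # Assumption: an empty or zero expiry denotes a session cookie.
        if expires.strip() in ("", "0"):
            continue
        if int(float(expires)) <= now:
            stale.append(name)
    return stale
```

For example, `stale_cookies(open("cookies.txt").read())` returns a non-empty list when at least one cookie for the site has expired, which is a hint to log in again in the browser and re-export the file.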

remitamine commented 5 years ago

"It's not a fix, but it's a valid workaround using the generic extractor."

No, I'm referring to change 118f7add3b9690884edb4dc887995f3815243c78, which fixes extraction of free videos without using --force-generic-extractor. It should also work with members-only videos when using cookies.

rgarat commented 5 years ago

I tried that fix and the generic extractor workaround, and it still fails with a 403 Forbidden error.

This is a free stream that fails: https://www.gdcvault.com/play/1025772/-Into-the-Breach-Design

and here is the log of when I tried the generic extractor workaround https://github.com/ytdl-org/youtube-dl/issues/20575#issuecomment-481032027

hope this helps

remitamine commented 5 years ago

@rgarat https://github.com/ytdl-org/youtube-dl/issues/20575#issuecomment-481195392.

Lockie85 commented 5 years ago

Hi,

I'm trying the following command:

youtube-dl --cookies "C:\youtube-dl\cookies.txt" --username username@domain.com --password goodPassword --force-generic-extractor https://www.gdcvault.com/play/1025792/-09-to-19-A

"cookies.txt" has been copied to that DIR and the URL is a random video I grabbed.

However, I am still getting the following:

ERROR: unable to download video data: HTTP Error 403: Forbidden

I also tried: youtube-dl --cookies "C:\youtube-dl\cookies.txt" --username username@domain.com --password goodPassword https://www.gdcvault.com/play/1025792/-09-to-19-A

But this gave me the following error:

ERROR: Unable to download webpage: HTTP Error 500: Internal Server Error (caused by HTTPError()); please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; type youtube-dl -U to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.

I'm sure I'm making a silly mistake, but could you help?

Many Thanks

remitamine commented 5 years ago

@Weggy the video does not require an account; it should work with --force-generic-extractor after selecting another format.

youtube-dl -f mp4-981 https://gdcvault.com/play/1025792/-09-to-19-A
[GDCVault] -09-to-19-A: Downloading webpage
[Kaltura] 0_hxhta2kr: Downloading video info JSON
[Kaltura] 0_hxhta2kr: Downloading m3u8 information
[download] Destination: 09 to '19 - A Decade of Approachability in Fighting Games-0_hxhta2kr.mp4
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100     1  100     1    0     0      1      0  0:00:01 --:--:--  0:00:01     1
  1  434M    1 5267k    0     0   645k      0  0:11:29  0:00:08  0:11:21  832k

Lockie85 commented 5 years ago

@remitamine I'm confused. It does require an account from what I can see, as I get the following:

C:\youtube-dl>youtube-dl -f mp4-981 https://gdcvault.com/play/1025792/-09-to-19-A
[GDCVault] -09-to-19-A: Downloading webpage
WARNING: [GDCVault] It looks like http://www.gdcvault.com/play/1025792 requires a login. Try specifying a username and password and try again.
WARNING: [GDCVault] Could not login.
ERROR: Unable to extract xml filename; please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; type  youtube-dl -U  to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.

remitamine commented 5 years ago

@Weggy, for now, use --force-generic-extractor; it won't be needed in the next version.

Lockie85 commented 5 years ago

@remitamine I got this working as you suggested: youtube-dl --force-generic-extractor -f mp4-981 https://gdcvault.com/play/1025792/-09-to-19-A

But I was just using this as an easy example. I can't get one that requires a login to work. For example: youtube-dl --force-generic-extractor -f mp4-981 --username username@domain.com --password goodPassword https://www.gdcvault.com/play/1025973/Disintegrating-Meshes-with-Particles-in

remitamine commented 5 years ago

@Weggy read previous comments.