ytdl-org / youtube-dl

Command-line program to download videos from YouTube.com and other video sites
http://ytdl-org.github.io/youtube-dl/
The Unlicense

oreilly login page error #30884

Open sandygmaharaj opened 2 years ago

sandygmaharaj commented 2 years ago

Verbose log

[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: [u'-u', u'PRIVATE', u'-p', u'PRIVATE', u'--verbose', u'--write-info-json', u'https://learning.oreilly.com/videos/linux-shell-scripting/9781789800906/']
[debug] Encodings: locale UTF-8, fs UTF-8, out UTF-8, pref UTF-8
[debug] youtube-dl version 2021.12.17
[debug] Python version 2.7.16 (CPython) - Linux-4.19.0-18-cloud-amd64-x86_64-with-debian-10.12
[debug] exe versions: ffmpeg 4.1.8-0, ffprobe 4.1.8-0
[debug] Proxy map: {}
[safari:course] Downloading login page
ERROR: An extractor error has occurred. (caused by KeyError(u'next',)); please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; type  youtube-dl -U  to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.
Traceback (most recent call last):
  File "/usr/local/bin/youtube-dl/youtube_dl/extractor/common.py", line 533, in extract
    self.initialize()
  File "/usr/local/bin/youtube-dl/youtube_dl/extractor/common.py", line 437, in initialize
    self._real_initialize()
  File "/usr/local/bin/youtube-dl/youtube_dl/extractor/safari.py", line 29, in _real_initialize
    self._login()
  File "/usr/local/bin/youtube-dl/youtube_dl/extractor/safari.py", line 51, in _login
    'https://api.oreilly.com', qs['next'][0])
KeyError: u'next'
Traceback (most recent call last):
  File "/usr/local/bin/youtube-dl/youtube_dl/YoutubeDL.py", line 815, in wrapper
    return func(self, *args, **kwargs)
  File "/usr/local/bin/youtube-dl/youtube_dl/YoutubeDL.py", line 836, in __extract_info
    ie_result = ie.extract(url)
  File "/usr/local/bin/youtube-dl/youtube_dl/extractor/common.py", line 547, in extract
    raise ExtractorError('An extractor error has occurred.', cause=e)
ExtractorError: An extractor error has occurred. (caused by KeyError(u'next',)); please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; type  youtube-dl -U  to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.

Description

The issue seems to be caused by changes to the login page of the O'Reilly site. youtube-dl is not able to log in and hence cannot proceed further.

dirkf commented 2 years ago

Use --cookies ... with a cookie file from your logged-in browser session, and don't use -u/--username ....
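Before retrying with --cookies, it can help to confirm that the exported file actually contains unexpired cookies for the site. The following is a minimal sketch using only the Python standard library; the function name `live_cookies` and the default domain suffix are illustrative assumptions, not part of youtube-dl.

```python
import time
from http.cookiejar import MozillaCookieJar


def live_cookies(cookie_path, domain_suffix="oreilly.com"):
    """Return sorted names of unexpired cookies whose domain ends with domain_suffix.

    Hypothetical helper for checking an exported cookie file; not youtube-dl code.
    """
    jar = MozillaCookieJar()
    # ignore_expires=True loads expired entries too, so they can be filtered
    # explicitly below; ignore_discard=True keeps session cookies.
    jar.load(cookie_path, ignore_discard=True, ignore_expires=True)
    now = time.time()
    return sorted(
        c.name for c in jar
        if c.domain.endswith(domain_suffix)
        and (c.expires is None or c.expires > now)
    )
```

An empty result for a file exported from a logged-in session would suggest the export itself is the problem. The file must be in the Netscape cookie format, which is also the format --cookies expects.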

sandygmaharaj commented 2 years ago

Still have an error:

[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: [u'--cookies', u'./cookies.txt', u'--verbose', u'--write-info-json', u'https://learning.oreilly.com/videos/complete-bash-shell/9781800209695/']
[debug] Encodings: locale UTF-8, fs UTF-8, out UTF-8, pref UTF-8
[debug] youtube-dl version 2021.12.17
[debug] Python version 2.7.16 (CPython) - Linux-4.19.0-18-cloud-amd64-x86_64-with-debian-10.12
[debug] exe versions: ffmpeg 4.1.8-0, ffprobe 4.1.8-0
[debug] Proxy map: {}
[safari:course] 9781800209695: Downloading course JSON
[download] Downloading playlist: Complete Bash Shell Scripting
[safari:course] playlist Complete Bash Shell Scripting: Collected 93 video ids (downloading 93 of them)
[download] Downloading video 1 of 93
[safari:api] 9781800209695/video1_1: Downloading part JSON
[safari] 9781800209695-video1_1: Downloading webpage
ERROR: Unable to extract kaltura reference id; please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; type youtube-dl -U to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.
Traceback (most recent call last):
  File "/usr/local/bin/youtube-dl/youtube_dl/YoutubeDL.py", line 815, in wrapper
    return func(self, *args, **kwargs)
  File "/usr/local/bin/youtube-dl/youtube_dl/YoutubeDL.py", line 836, in __extract_info
    ie_result = ie.extract(url)
  File "/usr/local/bin/youtube-dl/youtube_dl/extractor/common.py", line 534, in extract
    ie_result = self._real_extract(url)
  File "/usr/local/bin/youtube-dl/youtube_dl/extractor/safari.py", line 147, in _real_extract
    webpage, 'kaltura reference id', group='id')
  File "/usr/local/bin/youtube-dl/youtube_dl/extractor/common.py", line 1012, in _search_regex
    raise RegexNotFoundError('Unable to extract %s' % _name)
RegexNotFoundError: Unable to extract kaltura reference id; please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; type youtube-dl -U to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.

Btw, it was working fine with Oreilly a few days back.

adbprogrammer commented 2 years ago

Youtube-dl can't log in to learning.oreilly.com

C:\youtube-dl-master>youtube-dl -u xxxx -p xxxx -f "bestvideo[ext=mp4][height<=720]+bestaudio[ext=m4a]/best[ext=mp4]/best[ext=mp4][height<=720]" -v -i -c -w -o "E:\Oreilly\%(playlist)s\%(playlist_index)s-%(title)s.%(ext)s" --external-downloader aria2c.exe --all-subs --convert-subs srt --no-check-certificate --restrict-filenames https://learning.oreilly.com/videos/python-fundamentals/9780135917411/
[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: ['-u', 'PRIVATE', '-p', 'PRIVATE', '-f', 'bestvideo[ext=mp4][height<=720]+bestaudio[ext=m4a]/best[ext=mp4]/best[ext=mp4][height<=720]', '-v', '-i', '-c', '-w', '-o', 'E:\d_v_2022\Oreilly\%(playlist)s\%(playlist_index)s-%(title)s.%(ext)s', '--external-downloader', 'aria2c.exe', '--all-subs', '--convert-subs', 'srt', '--no-check-certificate', '--restrict-filenames', 'https://learning.oreilly.com/videos/python-fundamentals/9780135917411/']
[debug] Encodings: locale cp1252, fs utf-8, out utf-8, pref cp1252
[debug] youtube-dl version 2021.12.17
[debug] Python version 3.8.6 (CPython) - Windows-10-10.0.17763-SP0
[debug] exe versions: avconv v11.7, avprobe v11.7, ffmpeg git-2019-10-13-4f4334b, ffprobe git-2019-10-13-4f4334b
[debug] Proxy map: {}
[safari:course] Downloading login page
ERROR: An extractor error has occurred. (caused by KeyError('next')); please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; see https://yt-dl.org/update on how to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.
Traceback (most recent call last):
  File "c:\python38\lib\site-packages\youtube_dl\extractor\common.py", line 533, in extract
    self.initialize()
  File "c:\python38\lib\site-packages\youtube_dl\extractor\common.py", line 437, in initialize
    self._real_initialize()
  File "c:\python38\lib\site-packages\youtube_dl\extractor\safari.py", line 29, in _real_initialize
    self._login()
  File "c:\python38\lib\site-packages\youtube_dl\extractor\safari.py", line 51, in _login
    'https://api.oreilly.com', qs['next'][0])
KeyError: 'next'
Traceback (most recent call last):
  File "c:\python38\lib\site-packages\youtube_dl\extractor\common.py", line 533, in extract
    self.initialize()
  File "c:\python38\lib\site-packages\youtube_dl\extractor\common.py", line 437, in initialize
    self._real_initialize()
  File "c:\python38\lib\site-packages\youtube_dl\extractor\safari.py", line 29, in _real_initialize
    self._login()
  File "c:\python38\lib\site-packages\youtube_dl\extractor\safari.py", line 51, in _login
    'https://api.oreilly.com', qs['next'][0])
KeyError: 'next'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "c:\python38\lib\site-packages\youtube_dl\YoutubeDL.py", line 815, in wrapper
    return func(self, *args, **kwargs)
  File "c:\python38\lib\site-packages\youtube_dl\YoutubeDL.py", line 836, in __extract_info
    ie_result = ie.extract(url)
  File "c:\python38\lib\site-packages\youtube_dl\extractor\common.py", line 547, in extract
    raise ExtractorError('An extractor error has occurred.', cause=e)
youtube_dl.utils.ExtractorError: An extractor error has occurred. (caused by KeyError('next')); please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; see https://yt-dl.org/update on how to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.

prof-ninjason commented 2 years ago

Confirmed.

I refreshed my cookies. I removed username/password. I have evaluated my script into a single one-line command just to make sure it wasn't my own personal script. I also removed the last / in the URL just in case.

Still having a "next" error.

prof-ninjason commented 2 years ago

Regarding dirkf's advice ("Use --cookies ... with a cookie file from your logged-in browser session, and don't use -u/--username ...."):

To archive more than just 2 minutes of video, a temporary fix had been found: using both cookies and username/password allowed full-length archiving of the video.

ankurjain41282 commented 1 year ago

I am still facing the issue

robertgrubba commented 1 year ago

Got the same error. The fix posted by jcrochon doesn't work for me (using cookies and credentials as parameters doesn't work either).

hunter86bg commented 1 year ago

@dirkf, can you take a look at this one? It seems that on the second attempt (safari:api) there is no qs, and thus the extractor can't identify next_uri.
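The failure mode described here can be reproduced outside youtube-dl. Below is a minimal sketch, assuming the extractor parses the query string of the redirected login URL and joins the `next` value onto https://api.oreilly.com (the traceback shows only part of that call, so the `urljoin` step is an assumption). The function name and the sample URLs used with it are hypothetical.

```python
from urllib.parse import parse_qs, urljoin, urlparse


def next_uri_from(redirect_url, api_base="https://api.oreilly.com"):
    """Return the absolute 'next' URI from a redirect URL, or None if absent.

    Hypothetical helper illustrating the lookup; not the extractor's code.
    """
    qs = parse_qs(urlparse(redirect_url).query)
    # qs['next'][0] raises KeyError when the parameter is missing, which is
    # the error in the tracebacks; .get() lets the caller handle that case.
    values = qs.get("next")
    if not values:
        return None
    return urljoin(api_base, values[0])
```

With this shape, a redirect URL that carries no query string yields None instead of an exception, so the caller can decide whether that means "already logged in" or "login page changed".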

epsilonSpider commented 1 year ago

@sandygmaharaj @ankurjain41282 @robertgrubba if you would like, can you check if this change addresses the issue here: https://github.com/epsilonSpider/youtube-dl/commit/d524ac1898ad680e3476be30e218f44e42961374 ?

hunter86bg commented 1 year ago

I managed to download a single video, so it looks better now.

hunter86bg commented 1 year ago

I managed to download a single video (longer than a minute).

Edit: I meant that I managed to download a whole course in which at least one of the videos is longer than a minute.

Kerruba commented 1 year ago

From my understanding, the problem lies in (at least) two places:

  1. If you don't provide a username/password through the options, the user is considered not logged in even if credentials are passed via cookies (https://github.com/ytdl-org/youtube-dl/blob/f24bc9272e9b74efc4c4af87c862f5f78921d424/youtube_dl/extractor/safari.py#L33):
        username, password = self._get_login_info()
        if username is None:
            return
  2. Apparently the is_logged function is not working properly either. This PR should fix that issue: https://github.com/ytdl-org/youtube-dl/pull/31524

I tried replacing the is_logged function locally:

        def is_logged(urlh):
            url = urlh.geturl()
            parsed_url = compat_urlparse.urlparse(url)
            return parsed_url.hostname.endswith('learning.oreilly.com') and (
                parsed_url.path.startswith('/home/')
                or (parsed_url.path == '/member/login/' and not parsed_url.query))

This, in combination with -u someuser -p somepass --cookies <cookies-file>, seems to solve the problem.
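To see which final URLs the replacement check accepts, here is a standalone sketch of the same logic. It assumes a plain URL string instead of the response handle (`urlh`) and uses `urllib.parse` rather than `compat_urlparse`; the guard for a missing hostname is an addition, and the URLs tried against it are illustrative.

```python
from urllib.parse import urlparse


def is_logged(final_url):
    """Standalone version of the patched check, taking a URL string."""
    parsed = urlparse(final_url)
    # hostname is None for relative or malformed URLs; treat that as logged
    # out instead of raising AttributeError on None.endswith(...).
    host = parsed.hostname or ""
    return host.endswith("learning.oreilly.com") and (
        parsed.path.startswith("/home/")
        or (parsed.path == "/member/login/" and not parsed.query)
    )
```

Under this check, https://learning.oreilly.com/home/ and a bare https://learning.oreilly.com/member/login/ count as logged in, while the same login path with a query string does not.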

dirkf commented 1 year ago

If the cookies file is valid, the -u ... -p ... should be unnecessary. The code at 1. is operating correctly: it doesn't try to log you in if you didn't ask for it. The change at 2. is probably what makes the difference.

Kerruba commented 1 year ago

@dirkf this is not the behaviour I'm seeing. I've updated my local safari extractor to print info about the login process and tried to download the file reported above both with username/password + cookies and with cookies only. Note that the cookies are valid. Here is the diff:

diff --git a/youtube_dl/extractor/safari.py b/youtube_dl/extractor/safari.py
index 2cc665122..161cc94f2 100644
--- a/youtube_dl/extractor/safari.py
+++ b/youtube_dl/extractor/safari.py
@@ -31,14 +31,21 @@ class SafariBaseIE(InfoExtractor):
     def _login(self):
         username, password = self._get_login_info()
         if username is None:
+            self.to_screen('Not Logged in')
             return
+        self.to_screen('Using user {}'.format(username))

         _, urlh = self._download_webpage_handle(
             'https://learning.oreilly.com/accounts/login-check/', None,
             'Downloading login page')

         def is_logged(urlh):
-            return 'learning.oreilly.com/home/' in urlh.geturl()
+            url = urlh.geturl()
+            parsed_url = compat_urlparse.urlparse(url)
+            return parsed_url.hostname.endswith('learning.oreilly.com') and (
+                parsed_url.path.startswith('/home/')
+                or (parsed_url.path == '/member/login/' and not parsed_url.query))
+            # return 'learning.oreilly.com/home/' in urlh.geturl()

         if is_logged(urlh):
             self.LOGGED_IN = True

Here are the results I get from the download:

  1. Only valid cookies, no username/password provided
    
    > youtube-dl -o file_without_username_and_password.mp4 --playlist-items 1 --cookies /tmp/learning.oreilly.com_cookies.txt https://learning.oreilly.com/videos/complete-bash-shell/9781800209695/
    [safari:course] Not Logged in
    [safari:course] 9781800209695: Downloading course JSON
    [download] Downloading playlist: Complete Bash Shell Scripting
    [safari:course] playlist Complete Bash Shell Scripting: Collected 93 video ids (downloading 1 of them)
    [download] Downloading video 1 of 1
    [safari:api] Not Logged in
    [safari:api] 9781800209695/video1_1: Downloading part JSON
    [safari] Not Logged in
    [safari] 9781800209695-video1_1: Downloading webpage
    [Kaltura] 9781800209695-video1_1: Downloading webpage
    [Kaltura] 0_eiswe197: Downloading video info JSON
    [Kaltura] 0_eiswe197: Checking mp4-1512 URL
    [Kaltura] 0_eiswe197: Downloading m3u8 information
    [download] Destination: file_without_username_and_password.mp4
    [download] 100% of 8.44MiB in 00:01
    [download] Finished downloading playlist: Complete Bash Shell Scripting

    > ffmpeg -i file_without_username_and_password.mp4 2>&1 | grep "Duration"| cut -d ' ' -f 4 | sed s/,//
    00:01:00.00

  2. A totally made-up username and password, along with the valid cookies

    > youtube-dl --username someuser --password somepass -o file_with_username_and_password.mp4 --playlist-items 1 --cookies /tmp/learning.oreilly.com_cookies.txt https://learning.oreilly.com/videos/complete-bash-shell/9781800209695/
    [safari:course] Using user someuser
    [safari:course] Downloading login page
    [safari:course] 9781800209695: Downloading course JSON
    [download] Downloading playlist: Complete Bash Shell Scripting
    [safari:course] playlist Complete Bash Shell Scripting: Collected 93 video ids (downloading 1 of them)
    [download] Downloading video 1 of 1
    [safari:api] Using user someuser
    [safari:api] Downloading login page
    [safari:api] 9781800209695/video1_1: Downloading part JSON
    [safari] Using user someuser
    [safari] Downloading login page
    [safari] 9781800209695-video1_1: Downloading webpage
    [safari] 9781800209695-video1_1: Downloading kaltura session JSON
    [Kaltura] 9781800209695-video1_1: Downloading webpage
    [Kaltura] 0_eiswe197: Downloading video info JSON
    [Kaltura] 0_eiswe197: Checking mp4-1512 URL
    [Kaltura] 0_eiswe197: Downloading m3u8 information
    [download] Destination: file_with_username_and_password.mp4
    [download] 100% of 189.76MiB in 00:34
    [download] Finished downloading playlist: Complete Bash Shell Scripting

    > ffmpeg -i file_with_username_and_password.mp4 2>&1 | grep "Duration"| cut -d ' ' -f 4 | sed s/,//
    00:17:32.25

As you can see, providing the username and password makes a difference in the length of the downloaded file. I didn't dig into the code that much, but it seems to me something wrong is going on.
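For comparing many downloads, the ffmpeg/grep/cut/sed pipeline can be replaced by a small parser that turns the `Duration:` field into seconds. This is a sketch that assumes ffmpeg's usual `Duration: HH:MM:SS.ss,` line format; the function name is made up for illustration.

```python
import re


def duration_seconds(ffmpeg_output):
    """Extract the first 'Duration: HH:MM:SS.ss' value as seconds, or None."""
    match = re.search(
        r"Duration:\s*(\d+):(\d{2}):(\d{2}(?:\.\d+)?)", ffmpeg_output
    )
    if match is None:
        return None
    hours, minutes, seconds = match.groups()
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)
```

Applied to the two durations reported above, 00:01:00.00 gives 60.0 seconds and 00:17:32.25 gives 1052.25 seconds, which makes the truncated preview easy to flag with a simple threshold.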

dirkf commented 1 year ago

So those results are with the patch applied?

Maybe it's important to update the cookies by visiting the login page, after which the login procedure is bypassed. Needs investigation.

Kerruba commented 1 year ago

Yes, the results are with the patch applied; sorry if that wasn't clear. It does need more investigation. I'll try to find some time for that in the next couple of days, hopefully.