ytdl-org / youtube-dl

Command-line program to download videos from YouTube.com and other video sites
http://ytdl-org.github.io/youtube-dl/
The Unlicense

Failing to parse JSON when downloading from pluralsight #30765

Open peppelin opened 2 years ago

peppelin commented 2 years ago

Verbose log

[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: [u'-i', u'--cookies', u'/pluralsight.com_cookies.txt', u'--sleep-interval', u'60', u'https://app.pluralsight.com/library/courses/configuring-managing-kubernetes-storage-scheduling/table-of-contents', u'-o', u'./kubernetes/helm/configuring-managing-kubernetes-storage-scheduling/%(playlist_index)02d-%(title)s.%(ext)s', u'--verbose']
[debug] Encodings: locale UTF-8, fs UTF-8, out UTF-8, pref UTF-8
[debug] youtube-dl version 2021.12.17
[debug] Python version 2.7.16 (CPython) - Linux-5.10.63-v7+-armv7l-with-debian-10.11
[debug] exe versions: ffmpeg 4.1.8-0, ffprobe 4.1.8-0
[debug] Proxy map: {}
[pluralsight:course] configuring-managing-kubernetes-storage-scheduling: Downloading JSON metadata
[pluralsight:course] configuring-managing-kubernetes-storage-scheduling: Downloading JSON metadata
ERROR: configuring-managing-kubernetes-storage-scheduling: Failed to parse JSON  (caused by ValueError('No JSON object could be decoded',)); please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; type  youtube-dl -U  to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.
Traceback (most recent call last):
  File "/usr/local/bin/youtube-dl/youtube_dl/extractor/common.py", line 906, in _parse_json
    return json.loads(json_string)
  File "/usr/lib/python2.7/json/__init__.py", line 339, in loads
    return _default_decoder.decode(s)
  File "/usr/lib/python2.7/json/decoder.py", line 364, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib/python2.7/json/decoder.py", line 382, in raw_decode
    raise ValueError("No JSON object could be decoded")
ValueError: No JSON object could be decoded
Traceback (most recent call last):
  File "/usr/local/bin/youtube-dl/youtube_dl/YoutubeDL.py", line 815, in wrapper
    return func(self, *args, **kwargs)
  File "/usr/local/bin/youtube-dl/youtube_dl/YoutubeDL.py", line 836, in __extract_info
    ie_result = ie.extract(url)
  File "/usr/local/bin/youtube-dl/youtube_dl/extractor/common.py", line 534, in extract
    ie_result = self._real_extract(url)
  File "/usr/local/bin/youtube-dl/youtube_dl/extractor/pluralsight.py", line 467, in _real_extract
    course = self._download_course(course_id, url, course_id)
  File "/usr/local/bin/youtube-dl/youtube_dl/extractor/pluralsight.py", line 93, in _download_course
    headers={'Referer': url})
  File "/usr/local/bin/youtube-dl/youtube_dl/extractor/common.py", line 899, in _download_json
    expected_status=expected_status)
  File "/usr/local/bin/youtube-dl/youtube_dl/extractor/common.py", line 883, in _download_json_handle
    fatal=fatal), urlh
  File "/usr/local/bin/youtube-dl/youtube_dl/extractor/common.py", line 910, in _parse_json
    raise ExtractorError(errmsg, cause=ve)
ExtractorError: configuring-managing-kubernetes-storage-scheduling: Failed to parse JSON  (caused by ValueError('No JSON object could be decoded',)); please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; type  youtube-dl -U  to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.

Description

I've tried to download some courses, and the downloader is unable to parse the JSON. I've tried different computers and operating systems, and I also tried to re-download a course that I downloaded a few weeks ago, this time without success.

DaveBoltman commented 1 year ago

This affects me as well, and I'm sure many others, because it's Free Pluralsight Week 😀

Here is my report, based on the broken-site template, which I hope helps someone fix the problem. I'm adding it to this existing open issue. See "Description" below!

Verbose log

[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: ['--username', 'PRIVATE', '--password', 'PRIVATE', 'https://app.pluralsight.com/library/courses/fiddler-everywhere-3-big-picture/table-of-contents', '-o', 'C:/Users/Dave/Videos/Pluralsight/%%(playlist)s/%%(chapter_number)02d - %%(chapter)s/%%(playlist_index)02d - %%(title)s.%%(ext)s', '--sleep-interval', '65', '--max-sleep-interval', '120', '--sub-lang', 'en', '--sub-format', 'srt', '--write-sub', '--verbose']
[debug] Encodings: locale cp1252, fs mbcs, out cp437, pref cp1252
[debug] youtube-dl version 2021.12.17
[debug] Python version 3.4.4 (CPython) - Windows-10-10.0.19041
[debug] exe versions: none
[debug] Proxy map: {}
[pluralsight:course] fiddler-everywhere-3-big-picture: Downloading JSON metadata
[pluralsight:course] fiddler-everywhere-3-big-picture: Downloading JSON metadata
ERROR: fiddler-everywhere-3-big-picture: Failed to parse JSON  (caused by ValueError('Expecting value: line 1 column 1 (char 0)',)); please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; type  youtube-dl -U  to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.
Traceback (most recent call last):
  File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\ytdl-org\tmpupik7c6w\build\youtube_dl\extractor\common.py", line 906, in _parse_json
  File "C:\Python\Python34\lib\json\__init__.py", line 318, in loads
  File "C:\Python\Python34\lib\json\decoder.py", line 343, in decode
  File "C:\Python\Python34\lib\json\decoder.py", line 361, in raw_decode
ValueError: Expecting value: line 1 column 1 (char 0)
Traceback (most recent call last):
  File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\ytdl-org\tmpupik7c6w\build\youtube_dl\extractor\common.py", line 906, in _parse_json
  File "C:\Python\Python34\lib\json\__init__.py", line 318, in loads
  File "C:\Python\Python34\lib\json\decoder.py", line 343, in decode
  File "C:\Python\Python34\lib\json\decoder.py", line 361, in raw_decode
ValueError: Expecting value: line 1 column 1 (char 0)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\ytdl-org\tmpupik7c6w\build\youtube_dl\YoutubeDL.py", line 815, in wrapper
  File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\ytdl-org\tmpupik7c6w\build\youtube_dl\YoutubeDL.py", line 836, in __extract_info
  File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\ytdl-org\tmpupik7c6w\build\youtube_dl\extractor\common.py", line 534, in extract
  File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\ytdl-org\tmpupik7c6w\build\youtube_dl\extractor\pluralsight.py", line 467, in _real_extract
  File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\ytdl-org\tmpupik7c6w\build\youtube_dl\extractor\pluralsight.py", line 93, in _download_course
  File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\ytdl-org\tmpupik7c6w\build\youtube_dl\extractor\common.py", line 899, in _download_json
  File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\ytdl-org\tmpupik7c6w\build\youtube_dl\extractor\common.py", line 883, in _download_json_handle
  File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\ytdl-org\tmpupik7c6w\build\youtube_dl\extractor\common.py", line 910, in _parse_json
youtube_dl.utils.ExtractorError: fiddler-everywhere-3-big-picture: Failed to parse JSON  (caused by ValueError('Expecting value: line 1 column 1 (char 0)',)); please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; type  youtube-dl -U  to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.

Description

I'm trying to download a course from Pluralsight (because it's free Pluralsight week!). I can access the course fine in Chrome (including videos) using my credentials. I searched for similar existing issues, and they all have the "broken-IE" label, which doesn't make sense to me because I'm not using IE at all. Maybe youtube-dl uses it behind the scenes without my knowing. I'm not going to give you my Pluralsight credentials because you can sign up for free.

dirkf commented 1 year ago

IE stands for InfoExtractor in this context, that is, the site-specific module that gets the data out of the page. I agree that there is an element of nerd-view in the tag name, but no-one has previously complained AFAIK.

What is happening is that some JSON sought by the extractor is being returned as an empty page. Why this is happening needs research by someone who cares. The extractor has a _download_course() method which tries to get JSON details of the course in two ways; apparently the first one fails silently inside a try: block and the error is being reported by the second, older, tactic. The next step would be to see how the first attempt fails:

         try:
             return self._download_course_rpc(course_id, url, display_id)
-        except ExtractorError:
-            # Old API fallback
-            return self._download_json(
-                'https://app.pluralsight.com/player/user/api/v1/player/payload',
-                display_id, data=urlencode_postdata({'courseId': course_id}),
-                headers={'Referer': url})
+        except ExtractorError as e:
+            try:
+                # Old API fallback
+                return self._download_json(
+                    'https://app.pluralsight.com/player/user/api/v1/player/payload',
+                    display_id, data=urlencode_postdata({'courseId': course_id}),
+                    headers={'Referer': url})
+            except Exception:
+                raise e
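
As a rough, self-contained illustration of the pattern in the diff above (not the extractor's actual code), the sketch below uses two hypothetical stand-in functions, download_course_rpc() and download_course_legacy(), plus a local ExtractorError class that only mimics the name from youtube_dl.utils. The failure reason of the first attempt is a placeholder, since the real one is unknown. The fallback parses an empty response body, which is assumed to be what produces the "Failed to parse JSON" error in the logs.

```python
import json


class ExtractorError(Exception):
    """Local stand-in for youtube_dl.utils.ExtractorError (illustration only)."""


def download_course_rpc(course_id):
    # Hypothetical primary attempt; the real failure reason is not known.
    raise ExtractorError('%s: RPC request failed (placeholder reason)' % course_id)


def download_course_legacy(course_id):
    # Hypothetical old-API fallback that receives an empty response body.
    body = ''
    try:
        return json.loads(body)
    except ValueError as ve:
        # An empty string is not valid JSON, so json.loads() raises ValueError.
        raise ExtractorError(
            '%s: Failed to parse JSON (caused by %r)' % (course_id, ve))


def download_course(course_id):
    try:
        return download_course_rpc(course_id)
    except ExtractorError as e:
        try:
            return download_course_legacy(course_id)
        except Exception:
            # Surface the first failure instead of the fallback's error.
            raise e


def describe_failure(course_id):
    # Helper for the example: return the reported error message, if any.
    try:
        download_course(course_id)
    except ExtractorError as err:
        return str(err)
    return None
```

Calling describe_failure('demo') here returns the message from the first attempt rather than the JSON parsing error, which is the information the patch is meant to expose.
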
dirkf commented 1 year ago

See https://github.com/ytdl-org/youtube-dl/issues/29234#issuecomment-1031382362.