ytdl-org / youtube-dl

Command-line program to download videos from YouTube.com and other video sites
http://ytdl-org.github.io/youtube-dl/
The Unlicense

Error parsing Instagram channel JSON #28059

Open pboettcher opened 3 years ago

pboettcher commented 3 years ago

A JSON parsing error occurs when downloading from Instagram. I just updated the application, and the error is still there.

[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: ['-o', '%(upload_date)s-%(id)s %(title)s.%(ext)s', '--verbose', '--download-archive', '.archive', '-i', 'https://www.instagram.com/metasensitive/']
[debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
[debug] youtube-dl version 2021.01.24.1
[debug] Python version 3.4.4 (CPython) - Windows-7-6.1.7601-SP1
[debug] exe versions: ffmpeg git-2020-03-15-c467328, ffprobe git-2020-03-15-c467328
[debug] Proxy map: {}
[instagram:user] metasensitive: Downloading webpage
[download] Downloading playlist: metasensitive
[instagram:user] metasensitive: Downloading JSON page 1
[instagram:user] metasensitive: Downloading JSON page 2
[instagram:user] metasensitive: Downloading JSON page 3
[instagram:user] metasensitive: Downloading JSON page 4
[instagram:user] metasensitive: Downloading JSON page 5
[instagram:user] metasensitive: Downloading JSON page 6
[instagram:user] metasensitive: Downloading JSON page 7
[instagram:user] metasensitive: Downloading JSON page 8
[instagram:user] metasensitive: Downloading JSON page 9
[instagram:user] metasensitive: Downloading JSON page 10
[instagram:user] metasensitive: Downloading JSON page 11
[instagram:user] metasensitive: Downloading JSON page 12
[instagram:user] metasensitive: Downloading JSON page 13
[instagram:user] metasensitive: Downloading JSON page 14
[instagram:user] metasensitive: Downloading JSON page 15
[instagram:user] metasensitive: Downloading JSON page 16
[instagram:user] metasensitive: Downloading JSON page 17
[instagram:user] metasensitive: Downloading JSON page 18
[instagram:user] metasensitive: Downloading JSON page 19
[instagram:user] metasensitive: Downloading JSON page 20
[instagram:user] metasensitive: Downloading JSON page 21
ERROR: metasensitive: Failed to parse JSON (caused by ValueError('Expecting value: line 1 column 1 (char 0)',)); please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; type youtube-dl -U to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.
Traceback (most recent call last):
  File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\ytdl-org\tmp_eeieglg\build\youtube_dl\extractor\common.py", line 904, in _parse_json
  File "C:\Python\Python34\lib\json\__init__.py", line 318, in loads
  File "C:\Python\Python34\lib\json\decoder.py", line 343, in decode
  File "C:\Python\Python34\lib\json\decoder.py", line 361, in raw_decode
ValueError: Expecting value: line 1 column 1 (char 0)
Traceback (most recent call last):
  File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\ytdl-org\tmp_eeieglg\build\youtube_dl\extractor\common.py", line 904, in _parse_json
  File "C:\Python\Python34\lib\json\__init__.py", line 318, in loads
  File "C:\Python\Python34\lib\json\decoder.py", line 343, in decode
  File "C:\Python\Python34\lib\json\decoder.py", line 361, in raw_decode
ValueError: Expecting value: line 1 column 1 (char 0)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\ytdl-org\tmp_eeieglg\build\youtube_dl\YoutubeDL.py", line 806, in wrapper
  File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\ytdl-org\tmp_eeieglg\build\youtube_dl\YoutubeDL.py", line 838, in __extract_info
  File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\ytdl-org\tmp_eeieglg\build\youtube_dl\YoutubeDL.py", line 924, in process_ie_result
  File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\ytdl-org\tmp_eeieglg\build\youtube_dl\YoutubeDL.py", line 1021, in __process_playlist
  File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\ytdl-org\tmp_eeieglg\build\youtube_dl\extractor\instagram.py", line 308, in _extract_graphql
  File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\ytdl-org\tmp_eeieglg\build\youtube_dl\extractor\common.py", line 897, in _download_json
  File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\ytdl-org\tmp_eeieglg\build\youtube_dl\extractor\common.py", line 881, in _download_json_handle
  File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\ytdl-org\tmp_eeieglg\build\youtube_dl\extractor\common.py", line 908, in _parse_json
youtube_dl.utils.ExtractorError: metasensitive: Failed to parse JSON (caused by ValueError('Expecting value: line 1 column 1 (char 0)',)); please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; type youtube-dl -U to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.

adrianheine commented 3 years ago

https://www.instagram.com/metasensitive/ requires a login for me, maybe that's the issue?

dwyart commented 3 years ago

I also get errors here https://www.instagram.com/anaide.rozam/, and this doesn't require a login in a browser.

adrianheine commented 3 years ago

https://www.instagram.com/anaide.rozam/ works for me.

dwyart commented 3 years ago

Most of the time, I get this:

[instagram:user] anaide.rozam: Downloading webpage
[download] Downloading playlist: anaide.rozam
Traceback (most recent call last):
  File "/usr/lib/python3.9/runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.9/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/dw/youtube-dl/youtube-dl/__main__.py", line 19, in <module>
  File "/home/dw/youtube-dl/youtube-dl/youtube_dl/__init__.py", line 475, in main
  File "/home/dw/youtube-dl/youtube-dl/youtube_dl/__init__.py", line 465, in _real_main
  File "/home/dw/youtube-dl/youtube-dl/youtube_dl/YoutubeDL.py", line 2055, in download
  File "/home/dw/youtube-dl/youtube-dl/youtube_dl/YoutubeDL.py", line 799, in extract_info
  File "/home/dw/youtube-dl/youtube-dl/youtube_dl/YoutubeDL.py", line 806, in wrapper
  File "/home/dw/youtube-dl/youtube-dl/youtube_dl/YoutubeDL.py", line 838, in __extract_info
  File "/home/dw/youtube-dl/youtube-dl/youtube_dl/YoutubeDL.py", line 924, in process_ie_result
  File "/home/dw/youtube-dl/youtube-dl/youtube_dl/YoutubeDL.py", line 1020, in __process_playlist
  File "/home/dw/youtube-dl/youtube-dl/youtube_dl/extractor/instagram.py", line 283, in _extract_graphql
  File "/home/dw/youtube-dl/youtube-dl/youtube_dl/extractor/instagram.py", line 413, in _query_vars_for
KeyError: 'ProfilePage'

And, rarely, either of these:

[instagram:user] anaide.rozam: Downloading webpage
[download] Downloading playlist: anaide.rozam
[instagram:user] anaide.rozam: Downloading JSON page 1
[instagram:user] anaide.rozam: Downloading JSON page 2
[instagram:user] anaide.rozam: Downloading JSON page 3
[instagram:user] anaide.rozam: Downloading JSON page 4
[instagram:user] anaide.rozam: Downloading JSON page 5
[instagram:user] anaide.rozam: Downloading JSON page 6
[instagram:user] anaide.rozam: Downloading JSON page 7
[instagram:user] anaide.rozam: Downloading JSON page 8
[instagram:user] anaide.rozam: Downloading JSON page 9
[instagram:user] anaide.rozam: Downloading JSON page 10
ERROR: anaide.rozam: Failed to parse JSON  (caused by JSONDecodeError('Expecting value: line 1 column 1 (char 0)')); please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; type  youtube-dl -U  to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.

[instagram:user] anaide.rozam: Downloading webpage
[download] Downloading playlist: anaide.rozam
[instagram:user] anaide.rozam: Downloading JSON page 1
ERROR: anaide.rozam: Failed to parse JSON  (caused by JSONDecodeError('Expecting value: line 1 column 1 (char 0)')); please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; type  youtube-dl -U  to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.

dwyart commented 3 years ago

It has been like this for about 10 days; it worked flawlessly every time before that.

adrianheine commented 3 years ago

Curious. With git HEAD, I get:

[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: ['https://www.instagram.com/anaide.rozam/', '-s', '-v']
[debug] Encodings: locale UTF-8, fs utf-8, out utf-8, pref UTF-8
[debug] youtube-dl version 2021.02.04.1
[debug] Git HEAD: 1641b1323
[debug] Python version 3.9.1+ (CPython) - Linux-5.10.0-2-amd64-x86_64-with-glibc2.31
[debug] exe versions: ffmpeg 4.3.1-8, ffprobe 4.3.1-8
[debug] Proxy map: {}
[instagram:user] anaide.rozam: Downloading webpage
[download] Downloading playlist: anaide.rozam
[instagram:user] anaide.rozam: Downloading JSON page 1
[instagram:user] anaide.rozam: Downloading JSON page 2
[instagram:user] anaide.rozam: Downloading JSON page 3
[instagram:user] anaide.rozam: Downloading JSON page 4
[instagram:user] anaide.rozam: Downloading JSON page 5
[instagram:user] anaide.rozam: Downloading JSON page 6
[instagram:user] anaide.rozam: Downloading JSON page 7
[instagram:user] anaide.rozam: Downloading JSON page 8
[instagram:user] anaide.rozam: Downloading JSON page 9
[instagram:user] anaide.rozam: Downloading JSON page 10
[instagram:user] anaide.rozam: Downloading JSON page 11
[instagram:user] playlist anaide.rozam: Downloading 115 videos
[download] Downloading video 1 of 115
[Instagram] CK1nxS-Ck4a: Downloading webpage
[debug] Default format spec: bestvideo+bestaudio/best
[download] Downloading video 2 of 115
[Instagram] CJtgh1XC-_q: Downloading webpage
[…]

And with the version in Debian unstable:

[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: ['https://www.instagram.com/anaide.rozam/', '-s', '-v']
[debug] Encodings: locale UTF-8, fs utf-8, out utf-8, pref UTF-8
[debug] youtube-dl version 2021.01.08
[debug] Python version 3.9.1+ (CPython) - Linux-5.10.0-2-amd64-x86_64-with-glibc2.31
[debug] exe versions: ffmpeg 4.3.1-8, ffprobe 4.3.1-8
[debug] Proxy map: {}
[instagram:user] anaide.rozam: Downloading webpage
[download] Downloading playlist: anaide.rozam
[instagram:user] anaide.rozam: Downloading JSON page 1
[instagram:user] anaide.rozam: Downloading JSON page 2
[instagram:user] anaide.rozam: Downloading JSON page 3
[instagram:user] anaide.rozam: Downloading JSON page 4
[instagram:user] anaide.rozam: Downloading JSON page 5
[instagram:user] anaide.rozam: Downloading JSON page 6
[instagram:user] anaide.rozam: Downloading JSON page 7
[instagram:user] anaide.rozam: Downloading JSON page 8
[instagram:user] anaide.rozam: Downloading JSON page 9
[instagram:user] anaide.rozam: Downloading JSON page 10
[instagram:user] anaide.rozam: Downloading JSON page 11
[instagram:user] playlist anaide.rozam: Downloading 115 videos
[download] Downloading video 1 of 115
[Instagram] CK1nxS-Ck4a: Downloading webpage
[debug] Default format spec: bestvideo+bestaudio/best
[download] Downloading video 2 of 115

You could try --write-pages and upload the result somewhere.

dwyart commented 3 years ago

This doesn't get any further:

$ youtube-dl --write-pages "https://www.instagram.com/anaide.rozam/"
[instagram:user] anaide.rozam: Downloading webpage
[instagram:user] Saving request to anaide.rozam_https_-_www.instagram.com_accounts_login_.dump
[download] Downloading playlist: anaide.rozam
Traceback (most recent call last):
  File "/usr/lib/python3.9/runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.9/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/dw/youtube-dl/youtube-dl/__main__.py", line 19, in <module>
  File "/home/dw/youtube-dl/youtube-dl/youtube_dl/__init__.py", line 475, in main
  File "/home/dw/youtube-dl/youtube-dl/youtube_dl/__init__.py", line 465, in _real_main
  File "/home/dw/youtube-dl/youtube-dl/youtube_dl/YoutubeDL.py", line 2055, in download
  File "/home/dw/youtube-dl/youtube-dl/youtube_dl/YoutubeDL.py", line 799, in extract_info
  File "/home/dw/youtube-dl/youtube-dl/youtube_dl/YoutubeDL.py", line 806, in wrapper
  File "/home/dw/youtube-dl/youtube-dl/youtube_dl/YoutubeDL.py", line 838, in __extract_info
  File "/home/dw/youtube-dl/youtube-dl/youtube_dl/YoutubeDL.py", line 924, in process_ie_result
  File "/home/dw/youtube-dl/youtube-dl/youtube_dl/YoutubeDL.py", line 1020, in __process_playlist
  File "/home/dw/youtube-dl/youtube-dl/youtube_dl/extractor/instagram.py", line 283, in _extract_graphql
  File "/home/dw/youtube-dl/youtube-dl/youtube_dl/extractor/instagram.py", line 413, in _query_vars_for
KeyError: 'ProfilePage'

Here is also what I get with -v:

$ youtube-dl -v "https://www.instagram.com/anaide.rozam/"
[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: ['-v', 'https://www.instagram.com/anaide.rozam/']
[debug] Encodings: locale ISO-8859-1, fs iso8859-1, out iso8859-1, pref ISO-8859-1
[debug] youtube-dl version 2021.02.04.1
[debug] Python version 3.9.1+ (CPython) - Linux-5.10.0-3-amd64-x86_64-with-glibc2.31
[debug] exe versions: ffmpeg 4.3.1, ffprobe 4.3.1, phantomjs ., rtmpdump 2.4
[debug] Proxy map: {}
[instagram:user] anaide.rozam: Downloading webpage
[download] Downloading playlist: anaide.rozam
Traceback (most recent call last):
  File "/usr/lib/python3.9/runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.9/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/dw/youtube-dl/youtube-dl/__main__.py", line 19, in <module>
  File "/home/dw/youtube-dl/youtube-dl/youtube_dl/__init__.py", line 475, in main
  File "/home/dw/youtube-dl/youtube-dl/youtube_dl/__init__.py", line 465, in _real_main
  File "/home/dw/youtube-dl/youtube-dl/youtube_dl/YoutubeDL.py", line 2055, in download
  File "/home/dw/youtube-dl/youtube-dl/youtube_dl/YoutubeDL.py", line 799, in extract_info
  File "/home/dw/youtube-dl/youtube-dl/youtube_dl/YoutubeDL.py", line 806, in wrapper
  File "/home/dw/youtube-dl/youtube-dl/youtube_dl/YoutubeDL.py", line 838, in __extract_info
  File "/home/dw/youtube-dl/youtube-dl/youtube_dl/YoutubeDL.py", line 924, in process_ie_result
  File "/home/dw/youtube-dl/youtube-dl/youtube_dl/YoutubeDL.py", line 1020, in __process_playlist
  File "/home/dw/youtube-dl/youtube-dl/youtube_dl/extractor/instagram.py", line 283, in _extract_graphql
  File "/home/dw/youtube-dl/youtube-dl/youtube_dl/extractor/instagram.py", line 413, in _query_vars_for
KeyError: 'ProfilePage'

All of this is with git HEAD; Debian's version gives the same result.

dwyart commented 3 years ago

Today, I am completely unable to reproduce the rare cases I quoted earlier (I had kept them around from earlier this week); the only error I keep getting is KeyError: 'ProfilePage'.

adrianheine commented 3 years ago

anaide.rozam_https_-_www.instagram.com_accounts_login_.dump would be the interesting file :)
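
The dump file name above contains accounts_login, which suggests the profile request ended up on the login page. A minimal Python sketch of how that could produce KeyError: 'ProfilePage', assuming the page data is a nested dict that the extractor indexes by 'entry_data' and 'ProfilePage'. The helper profile_user_id, the sample dicts, and the 'LoginAndSignupPage' key are hypothetical and only for illustration; this is not youtube-dl's actual code.

```python
def profile_user_id(shared_data):
    """Return the profile's user id, or raise a descriptive error.

    The nesting used here is an assumption made for illustration.
    """
    entry_data = shared_data.get('entry_data', {})
    pages = entry_data.get('ProfilePage')
    if not pages:
        # Report which keys were present instead of a bare KeyError.
        found = ', '.join(sorted(entry_data)) or 'nothing'
        raise LookupError(
            'no ProfilePage in page data (found: %s); '
            'the request may have been redirected to a login page' % found)
    return pages[0]['graphql']['user']['id']


# Hypothetical page data for a public profile and for a login redirect.
profile_page = {'entry_data': {'ProfilePage': [{'graphql': {'user': {'id': '123'}}}]}}
login_page = {'entry_data': {'LoginAndSignupPage': [{}]}}

print(profile_user_id(profile_page))
try:
    profile_user_id(login_page)
except LookupError as err:
    print(err)
```

Checking for the key first turns the bare KeyError into a message that points at the likely cause.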

dwyart commented 3 years ago

Sorry, I missed that :)

I've put the file here: http://damien.wyart.free.fr/tmp/

ghost commented 3 years ago

Instagram error here as well:

macmini:~ steven$ youtube-dl https://instagram.com/gloriousabsenceofskill
[instagram:user] gloriousabsenceofskill: Downloading webpage
[download] Downloading playlist: gloriousabsenceofskill
[instagram:user] gloriousabsenceofskill: Downloading JSON page 1
ERROR: gloriousabsenceofskill: Failed to parse JSON  (caused by JSONDecodeError('Expecting value: line 1 column 1 (char 0)')); please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; see  https://yt-dl.org/update  on how to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.
macmini:~ steven$ youtube-dl https://instagram.com/gloriousabsenceofskill --verbose
[debug] System config: []
[debug] User config: ['-i', '--embed-subs', '--embed-thumbnail', '--add-metadata', '-o', '/Volumes/2 TB External SSD/Videos/%(uploader)s/%(upload_date)s %(title)s.%(ext)s', '-f', 'bestvideo+bestaudio[ext=m4a]/best', '--recode-video', 'mp4']
[debug] Custom config: []
[debug] Command-line args: ['https://instagram.com/gloriousabsenceofskill', '--verbose']
[debug] Encodings: locale UTF-8, fs utf-8, out utf-8, pref UTF-8
[debug] youtube-dl version 2021.05.16
[debug] Git HEAD: 76cfe88b7
[debug] Python version 3.9.5 (CPython) - macOS-11.4-arm64-arm-64bit
[debug] exe versions: ffmpeg 4.4, ffprobe 4.4, rtmpdump 2.4
[debug] Proxy map: {}
[instagram:user] gloriousabsenceofskill: Downloading webpage
[download] Downloading playlist: gloriousabsenceofskill
[instagram:user] gloriousabsenceofskill: Downloading JSON page 1
ERROR: gloriousabsenceofskill: Failed to parse JSON  (caused by JSONDecodeError('Expecting value: line 1 column 1 (char 0)')); please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; see  https://yt-dl.org/update  on how to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.
Traceback (most recent call last):
  File "/opt/homebrew/Cellar/youtube-dl/2021.5.16/libexec/lib/python3.9/site-packages/youtube_dl/extractor/common.py", line 906, in _parse_json
    return json.loads(json_string)
  File "/opt/homebrew/Cellar/python@3.9/3.9.5/Frameworks/Python.framework/Versions/3.9/lib/python3.9/json/__init__.py", line 346, in loads
    return _default_decoder.decode(s)
  File "/opt/homebrew/Cellar/python@3.9/3.9.5/Frameworks/Python.framework/Versions/3.9/lib/python3.9/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/opt/homebrew/Cellar/python@3.9/3.9.5/Frameworks/Python.framework/Versions/3.9/lib/python3.9/json/decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
Traceback (most recent call last):
  File "/opt/homebrew/Cellar/youtube-dl/2021.5.16/libexec/lib/python3.9/site-packages/youtube_dl/extractor/common.py", line 906, in _parse_json
    return json.loads(json_string)
  File "/opt/homebrew/Cellar/python@3.9/3.9.5/Frameworks/Python.framework/Versions/3.9/lib/python3.9/json/__init__.py", line 346, in loads
    return _default_decoder.decode(s)
  File "/opt/homebrew/Cellar/python@3.9/3.9.5/Frameworks/Python.framework/Versions/3.9/lib/python3.9/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/opt/homebrew/Cellar/python@3.9/3.9.5/Frameworks/Python.framework/Versions/3.9/lib/python3.9/json/decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/homebrew/Cellar/youtube-dl/2021.5.16/libexec/lib/python3.9/site-packages/youtube_dl/YoutubeDL.py", line 815, in wrapper
    return func(self, *args, **kwargs)
  File "/opt/homebrew/Cellar/youtube-dl/2021.5.16/libexec/lib/python3.9/site-packages/youtube_dl/YoutubeDL.py", line 847, in __extract_info
    return self.process_ie_result(ie_result, download, extra_info)
  File "/opt/homebrew/Cellar/youtube-dl/2021.5.16/libexec/lib/python3.9/site-packages/youtube_dl/YoutubeDL.py", line 933, in process_ie_result
    return self.__process_playlist(ie_result, download)
  File "/opt/homebrew/Cellar/youtube-dl/2021.5.16/libexec/lib/python3.9/site-packages/youtube_dl/YoutubeDL.py", line 1029, in __process_playlist
    entries = list(itertools.islice(
  File "/opt/homebrew/Cellar/youtube-dl/2021.5.16/libexec/lib/python3.9/site-packages/youtube_dl/extractor/instagram.py", line 325, in _extract_graphql
    json_data = self._download_json(
  File "/opt/homebrew/Cellar/youtube-dl/2021.5.16/libexec/lib/python3.9/site-packages/youtube_dl/extractor/common.py", line 895, in _download_json
    res = self._download_json_handle(
  File "/opt/homebrew/Cellar/youtube-dl/2021.5.16/libexec/lib/python3.9/site-packages/youtube_dl/extractor/common.py", line 881, in _download_json_handle
    return self._parse_json(
  File "/opt/homebrew/Cellar/youtube-dl/2021.5.16/libexec/lib/python3.9/site-packages/youtube_dl/extractor/common.py", line 910, in _parse_json
    raise ExtractorError(errmsg, cause=ve)
youtube_dl.utils.ExtractorError: gloriousabsenceofskill: Failed to parse JSON  (caused by JSONDecodeError('Expecting value: line 1 column 1 (char 0)')); please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; see  https://yt-dl.org/update  on how to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.

ghost commented 3 years ago

After some trial and error, I got it working by exporting my browser cookies with the ‘get cookies.txt’ Chrome extension and then passing the file to youtube-dl with the --cookies option.
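
Before passing an exported file to --cookies, it may help to confirm that it actually contains Instagram cookies. A small Python sketch using the standard library's http.cookiejar.MozillaCookieJar, which reads the Netscape cookies.txt format. The helper name instagram_cookie_names and the sample cookie names and values are made up for illustration; which cookies Instagram actually needs is not established in this thread.

```python
import http.cookiejar
import os
import tempfile


def instagram_cookie_names(path):
    """List the names of instagram.com cookies in a Netscape-format cookies.txt."""
    jar = http.cookiejar.MozillaCookieJar()
    # Keep session-only and expired cookies so the result mirrors the file.
    jar.load(path, ignore_discard=True, ignore_expires=True)
    return sorted(c.name for c in jar if c.domain.endswith('instagram.com'))


# Illustrative file contents; the cookie names and values are made up.
SAMPLE = (
    '# Netscape HTTP Cookie File\n'
    '.instagram.com\tTRUE\t/\tTRUE\t2147483647\tsessionid\tplaceholder\n'
    '.example.com\tTRUE\t/\tFALSE\t2147483647\tother\tplaceholder\n'
)

with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as handle:
    handle.write(SAMPLE)

names = instagram_cookie_names(handle.name)
os.remove(handle.name)
print(names)
```

An empty list would suggest the export was made while logged out or from the wrong site.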

ghost commented 3 years ago

Well, that solution worked temporarily. It definitely appears Instagram is forcing user login after a random number of pulls in order to prevent web scraping.
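
If Instagram does answer with a login page or an empty body instead of JSON, that would explain the message seen throughout this thread: Python's json parser fails on the very first character. A minimal Python sketch of that failure mode; the helper parse_json_or_explain and its wording are hypothetical, not part of youtube-dl.

```python
import json


def parse_json_or_explain(body):
    """Parse a response body as JSON, adding a hint when it is not JSON."""
    try:
        return json.loads(body)
    except ValueError as err:  # json.JSONDecodeError subclasses ValueError
        stripped = body.lstrip()
        if not stripped:
            reason = 'empty response body'
        elif stripped.startswith('<'):
            reason = 'response looks like HTML, possibly a login page'
        else:
            reason = 'response is not valid JSON'
        raise ValueError('%s (%s)' % (reason, err))


print(parse_json_or_explain('{"page": 1}'))
try:
    parse_json_or_explain('<!DOCTYPE html><html>...</html>')
except ValueError as err:
    print(err)
```

Catching ValueError covers both the plain ValueError reported under Python 3.4 and the JSONDecodeError reported under Python 3.9 in the logs above.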