Closed debagos closed 4 years ago
Hello debagos,
debagos writes:
It looks like Facebook had changed the Instagram profile page. I get a graphql key-error all the time...
[gallery-dl][debug] Version 1.10.1
[gallery-dl][debug] Python 3.6.8 - Linux-4.18.0-25-generic-x86_64-with-Ubuntu-18.10-cosmic
[gallery-dl][debug] requests 2.22.0 - urllib3 1.22
[1/3] https://www.instagram.com/REDACTED/
[gallery-dl][debug] Starting DownloadJob for 'https://www.instagram.com/REDACTED/'
[gallery-dl][debug] Updating urllib3 ciphers
[instagram][debug] Using InstagramUserExtractor for 'https://www.instagram.com/REDACTED/'
[instagram][info] Logging in as REDACTED
[urllib3.connectionpool][debug] Starting new HTTPS connection (1): www.instagram.com
[urllib3.connectionpool][debug] https://www.instagram.com:443 "GET /accounts/login/ HTTP/1.1" 200 9511
[urllib3.connectionpool][debug] https://www.instagram.com:443 "GET /web/__mid/ HTTP/1.1" 200 28
[urllib3.connectionpool][debug] https://www.instagram.com:443 "POST /accounts/login/ajax/ HTTP/1.1" 200 296
[urllib3.connectionpool][debug] https://www.instagram.com:443 "GET /REDACTED/ HTTP/1.1" 200 None
[instagram][error] An unexpected error occurred: KeyError - 'graphql'. Please run gallery-dl again with the --verbose flag, copy its output and report this issue on https://github.com/mikf/gallery-dl/issues .
[instagram][debug] Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/gallery_dl/job.py", line 47, in run
    for msg in self.extractor:
  File "/usr/local/lib/python3.6/dist-packages/gallery_dl/extractor/instagram.py", line 36, in items
    for data in self.instagrams():
  File "/usr/local/lib/python3.6/dist-packages/gallery_dl/extractor/instagram.py", line 205, in _extract_profilepage
    yield from self._extract_page(url, 'ProfilePage')
  File "/usr/local/lib/python3.6/dist-packages/gallery_dl/extractor/instagram.py", line 169, in _extract_page
    base_shared_data = shared_data['entry_data'][page_type][0]['graphql']
KeyError: 'graphql'
[2/3] [...]
Thank you for fixing, wish you a great day, yours sincerely.
JFTR, at least public profiles seem to work (if a public profile is also problematic, please share a non-redacted URL, if possible, to reproduce this issue).
If no one beats me to it, I'll try to investigate further later this evening (UTC) if I can find a private profile.
Thanks!
Leonardo Taccari writes:
[...]
JFTR, at least public profiles seem to work (if a public profile is also problematic, please share a non-redacted URL, if possible, to reproduce this issue).
If no one beats me to it, I'll try to investigate further later this evening (UTC) if I can find a private profile. [...]
I couldn't reproduce it with a private profile either (I tried both gallery-dl 1.10.1 and the latest Git HEAD, on NetBSD/evbarm with Python 3.7, but that's probably not important). Can you please share more information?
Looking again at the verbose output:
[gallery-dl][debug] Version 1.10.1
[gallery-dl][debug] Python 3.6.8 - Linux-4.18.0-25-generic-x86_64-with-Ubuntu-18.10-cosmic
[gallery-dl][debug] requests 2.22.0 - urllib3 1.22
[1/3] https://www.instagram.com/REDACTED/
[gallery-dl][debug] Starting DownloadJob for 'https://www.instagram.com/REDACTED/'
[gallery-dl][debug] Updating urllib3 ciphers
[instagram][debug] Using InstagramUserExtractor for 'https://www.instagram.com/REDACTED/'
[instagram][info] Logging in as REDACTED
[urllib3.connectionpool][debug] Starting new HTTPS connection (1): www.instagram.com
[urllib3.connectionpool][debug] https://www.instagram.com:443 "GET /accounts/login/ HTTP/1.1" 200 9511
[urllib3.connectionpool][debug] https://www.instagram.com:443 "GET /web/__mid/ HTTP/1.1" 200 28
[urllib3.connectionpool][debug] https://www.instagram.com:443 "POST /accounts/login/ajax/ HTTP/1.1" 200 296
[urllib3.connectionpool][debug] https://www.instagram.com:443 "GET /REDACTED/ HTTP/1.1" 200 None
The 'None' is unexpected, i.e. fetching the profile page should return a response containing data.
I would expect as [.../3] something like:
[1/3] user:
(That's when invoking gallery-dl as
`gallery-dl -u
Actually it doesn't matter if public or private profile...
I started the same downloads again without authentication towards Instagram and it worked. So maybe the extractor isn't causing the problem here.
I reckon that the problem is caused by my password.
It contains an apostrophe, and it was easier to use a config file containing the username and password than to escape the apostrophe successfully. That's why I don't use the -u <your_username> -p <your_password> method; I use --config <path> instead.
My method worked fine for weeks, but now it seems like I'm not logged in anymore through Gallery-DL...
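For anyone else with a quote character in their password, here is a minimal sketch of generating such a config file. It assumes the credentials live under extractor.instagram.username and extractor.instagram.password; the helper name build_config and the sample values are made up for illustration. JSON strings are delimited by double quotes, so an apostrophe needs no escaping at all.

```python
import json


def build_config(username, password):
    """Return a config dict with Instagram credentials (assumed key layout)."""
    return {
        "extractor": {
            "instagram": {
                "username": username,
                "password": password,
            }
        }
    }


# Hypothetical credentials; the password contains an apostrophe.
config_text = json.dumps(build_config("your_username", "pass'word"), indent=4)
print(config_text)
```

Saving the printed text to a file and passing that file with --config <path> avoids any shell quoting of the password.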
Edit: Is there a way to save a copy of the fetched document? Maybe that can tell us more about what's going on here...
debagos writes:
Actually it doesn't matter if public or private profile... I started the same downloads again without authentication towards Instagram and it worked. So maybe the extractor isn't causing the problem here. I reckon that the problem is caused by my password. It contains an apostrophe, and it was easier to use a config file containing the username and password than to escape the apostrophe successfully. That's why I don't use the -u <your_username> -p <your_password> method; I use --config <path> instead. My method worked fine for weeks, but now it seems like I'm not logged in anymore through Gallery-DL...
Can you please try logging in again via the web browser and then retry gallery-dl on a profile as an authenticated user?
At least after a couple of logins it seems that, when logging in via the web browser, Instagram asks for a verification code that is sent via email and then has to be entered in the login form.
I have never hit that via gallery-dl but this could explain the problem you are seeing (that's just a wild guess though without inspecting the responses).
I created a local copy of this repo and now I'm fiddling around, trying to find the cause... I'm definitely logged in, but the extractor fails at
if 'entry_data' in shared_data:
    base_shared_data = shared_data['entry_data'][psdf['page']][0]['graphql']
in extractor/instagram.py. I will report back if I can fix it.
@debagos Did you manage to find anything? Does this error still exist?
If it does, could you add
from .. import util
util.dump_json(shared_data)
exit()
after
https://github.com/mikf/gallery-dl/blob/23251356cbc06d8d2477ea34e3e2fe4ed2f99c9e/gallery_dl/extractor/instagram.py#L93
and post the output here? (Maybe use pastebin or similar if it's too long.)
The contents of page might also be interesting.
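If the full dump is too large or too personal to post, something like the following might help. It is only a sketch using the standard library, not part of gallery-dl: dump_shared_data and key_skeleton are hypothetical helper names. The first writes the parsed data to a file, the second keeps only the nested key structure and replaces every value with its type name.

```python
import json


def dump_shared_data(shared_data, path="shared_data.json"):
    """Write the parsed data to a file for later inspection."""
    with open(path, "w", encoding="utf-8") as fp:
        json.dump(shared_data, fp, indent=4, sort_keys=True, ensure_ascii=False)
    return path


def key_skeleton(obj, depth=5):
    """Return the nested key structure of obj without any actual values."""
    if depth == 0:
        return "..."
    if isinstance(obj, dict):
        return {key: key_skeleton(value, depth - 1) for key, value in obj.items()}
    if isinstance(obj, list):
        # Only the first element is kept; it is usually representative.
        return [key_skeleton(value, depth - 1) for value in obj[:1]]
    return type(obj).__name__


# Made-up sample data shaped like a profile page without a 'graphql' field.
sample = {"entry_data": {"ProfilePage": [{}]}, "country_code": "XX"}
print(json.dumps(key_skeleton(sample), indent=4))
```

The skeleton is usually enough to see whether a key such as graphql is present, without exposing account details.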
Sorry, I'm pretty busy at the moment... The problem still persists (v.1.10.3) and I did what you suggested @mikf. Thank you.
page contains the whole Instagram site and I am logged in. Good.
shared_data contains the inner HTML from the window._sharedData JavaScript. Good.
shared_data also contains the entry_data JSON field, which is checked in the following if statement: if 'entry_data' in shared_data: . Good.
entry_data only contains this: "ProfilePage": [{}] !
base_shared_data = shared_data['entry_data'][psdf['page']][0]['graphql'] throws an error. The extractor fails at this point because entry_data has no graphql field. Bad.
I searched page and found a graphql field inside a window.__additionalDataLoaded JavaScript call, which seems to hold all the relevant information the extractor needs.
My assumption is that I am part of a canary/experimental group which gets a newer Instagram layout. My knowledge of Python (or programming in general) is very limited, so I am not able to resolve this problem by myself. Even if I post my page content here, what about the other people with that old Instagram layout? I think the extractor will get pretty complex... What do you guys think, do you want to investigate further into this very specific problem or should we just wait and drink tea?
Thank you for the detailed response!
what about the other people with that old Instagram layout?
This would be handled by first checking if it's the "old" layout, i.e. if there is a graphql field in the initial shared_data, and otherwise it would switch to grabbing the data from window.__additionalDataLoaded or something like that. Shouldn't be very complicated.
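A rough sketch of that fallback, to make the idea concrete. This is not the actual gallery-dl code: the function name extract_graphql, the regular expressions, and the assumed shape of the window.__additionalDataLoaded(...) call (a quoted path followed by a JSON object) are guesses based on the description above and would need checking against a real page.

```python
import json
import re

# Assumed markup: window._sharedData = {...};</script>
SHARED_DATA_RE = re.compile(
    r"window\._sharedData\s*=\s*(\{.+?\});</script>", re.DOTALL)
# Assumed markup: window.__additionalDataLoaded('<path>', {...});</script>
ADDITIONAL_DATA_RE = re.compile(
    r"window\.__additionalDataLoaded\(\s*['\"][^'\"]*['\"]\s*,\s*(\{.+?\})\);</script>",
    re.DOTALL)


def extract_graphql(page, page_type):
    """Return the 'graphql' object from either layout, or None if absent."""
    # "Old" layout: graphql sits inside the initial shared data.
    match = SHARED_DATA_RE.search(page)
    if match:
        shared_data = json.loads(match.group(1))
        entries = shared_data.get("entry_data", {}).get(page_type) or [{}]
        if "graphql" in entries[0]:
            return entries[0]["graphql"]

    # "New" layout: graphql is passed to window.__additionalDataLoaded.
    match = ADDITIONAL_DATA_RE.search(page)
    if match:
        additional = json.loads(match.group(1))
        if "graphql" in additional:
            return additional["graphql"]

    return None
```

Calling extract_graphql(page, 'ProfilePage') would then cover both cases, and a None result signals that neither layout matched, rather than raising a KeyError.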
What do you guys think, do you want to investigate further into this very specific problem
Yes, I would really like to see how your Instagram (data) layout differs from a "normal" one, so this can hopefully be fixed. You also don't have to post the contents of page with your personal data out in the open. Sending an email or a PM on Gitter is a possibility as well.
Maybe related: https://github.com/instaloader/instaloader/issues/394
For the last few days I have been getting this graphql error. I have normally been able to download public and private profiles while logged in, but it seems Instagram has changed something on their end? Perhaps the rollout of the new dark theme within their app? I'm not the best with coding, so I'm not quite sure what went wrong, but I have preconfigured the .conf file with the correct details in my /etc directory as per the defaults.
Command typed into the terminal:
gallery-dl --sleep 02 https://www.instagram.com/REDACTED/
Below is the output.
[gallery-dl][debug] Version 1.10.6
[gallery-dl][debug] Python 3.5.2 - Linux-5.0.0-32-generic-x86_64-with-Ubuntu-18.04-bionic
[gallery-dl][debug] requests 2.22.0 - urllib3 1.25.6
[gallery-dl][debug] Starting DownloadJob for 'https://www.instagram.com/REDACTED/'
[gallery-dl][debug] Updating urllib3 ciphers
[instagram][debug] Using InstagramUserExtractor for 'https://www.instagram.com/REDACTED/'
[instagram][info] Logging in as REDACTED
[urllib3.connectionpool][debug] Starting new HTTPS connection (1): www.instagram.com:443
[urllib3.connectionpool][debug] https://www.instagram.com:443 "GET /accounts/login/ HTTP/1.1" 200 9969
[urllib3.connectionpool][debug] https://www.instagram.com:443 "GET /web/__mid/ HTTP/1.1" 200 28
[urllib3.connectionpool][debug] https://www.instagram.com:443 "POST /accounts/login/ajax/ HTTP/1.1" 200 412
[urllib3.connectionpool][debug] https://www.instagram.com:443 "GET /REDACTED/ HTTP/1.1" 200 18597
[urllib3.connectionpool][debug] https://www.instagram.com:443 "GET /p/REDACTED/ HTTP/1.1" 200 None
[instagram][error] An unexpected error occurred: KeyError - 'graphql'. Please run gallery-dl again with the --verbose flag, copy its output and report this issue on https://github.com/mikf/gallery-dl/issues .
[instagram][debug]
Traceback (most recent call last):
File "/snap/gallery-dl/865/lib/python3.5/site-packages/gallery_dl/job.py", line 47, in run
for msg in self.extractor:
File "/snap/gallery-dl/865/lib/python3.5/site-packages/gallery_dl/extractor/instagram.py", line 35, in items
for data in self.instagrams():
File "/snap/gallery-dl/865/lib/python3.5/site-packages/gallery_dl/extractor/instagram.py", line 427, in instagrams
'query_hash': 'f2405b236d85e8296cf30347c9f08c2a',
File "/snap/gallery-dl/865/lib/python3.5/site-packages/gallery_dl/extractor/instagram.py", line 269, in _extract_page
yield from self._extract_postpage(url)
File "/snap/gallery-dl/865/lib/python3.5/site-packages/gallery_dl/extractor/instagram.py", line 109, in _extract_postpage
media = shared_data['entry_data']['PostPage'][0]['graphql']['shortcode_media']
KeyError: 'graphql'
My own account now also has the new "layout" for Post pages it seems, and I've managed to implement a fix (https://github.com/mikf/gallery-dl/commit/5fa6ff04ddf1ef9145233237c635cce93b3a8687). But, as the commit message says, video downloads when logged in no longer work. Disabling downloader.ytdl.forward-cookies works around that for public videos, but private videos aren't downloadable any more.