Open Zalkota opened 4 years ago
Same here. Did you find a fix?
I have not found a fix.
Hey there! I get KeyError: 'ProfilePage' at line 34, with the error "path to profile media not found". How can I fix it?
Anyone figure it out?
The only thing I found was that Instagram redirects to login in production but somehow not in development. If you curl the Instagram feed you want to embed, you'll see that you get a 200 on localhost and a 300 on the server. At least that's what I got, and that is why django-instagram can't find the context it needs. Haven't found a fix though.
Hi guys, I have exactly the same problem. Did you find a solution?
Same issue here using Heroku
Hi, has anyone managed to find a solution?
I'm having the same Issue. The instagramUser in my example is 'leyendeckerbn' Looking deeper into it I see that I'm forwarded to the login page. Then there is no ProfilePage.
https://www.instagram.com:443 "GET /leyendeckers_bn/ HTTP/1.1" 302 0
https://www.instagram.com:443 "GET /accounts/login/?next=/leyendeckers_bn/ HTTP/1.1" 200 11288
profile['entry_data'] looks like this:
{'LoginAndSignupPage': [{'captcha': {'enabled': False, 'key': ''}, 'gdpr_required': False, 'tos_version': 'row', 'username_hint': ''}]}
Don't know how to fix that at the moment. Looking forward to more comments here.
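The failing lookup could at least be guarded so that a login redirect is reported as such instead of surfacing as a KeyError. This is only a minimal sketch, assuming the dictionary shape quoted in this thread; the names extract_media_edges and LoginRedirectError are hypothetical and not part of django-instagram:

```python
class LoginRedirectError(Exception):
    """Raised when Instagram served the login page instead of a profile."""


def extract_media_edges(profile, page=0):
    """Return the media edges from a parsed profile dict.

    Hypothetical helper: `profile` is assumed to be the parsed dict whose
    'entry_data' key is shown in this thread. It may also be None.
    """
    entry_data = (profile or {}).get('entry_data') or {}
    if 'LoginAndSignupPage' in entry_data:
        # Instagram answered with the login page, so there is no profile data.
        raise LoginRedirectError('Instagram redirected the request to the login page')
    pages = entry_data.get('ProfilePage') or []
    if page >= len(pages):
        return []
    user = pages[page].get('graphql', {}).get('user', {})
    return user.get('edge_owner_to_timeline_media', {}).get('edges', [])
```

This does not fix the redirect itself; it only makes the cause visible in the logs and lets the template render an empty feed.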
I'll see if I can replicate the problem; it sounds like a change on the Instagram side is causing it. It would be more helpful if more people did what @maxwhosevillage did and also posted the kind of request/user they are querying.
Also add what type of configuration you are running for production.
Just to find out what is happening, I simply ran: wget https://www.instagram.com/leyendeckers_bn
On my local development machine (macOS) the output is:
➜ ~ wget https://www.instagram.com/leyendeckers_bn
--2020-06-21 22:41:00-- https://www.instagram.com/leyendeckers_bn
Resolving www.instagram.com (www.instagram.com)... 2a03:2880:f23f:e5:face:b00c:0:4420, 157.240.27.174
Connecting to www.instagram.com (www.instagram.com)|2a03:2880:f23f:e5:face:b00c:0:4420|:443... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: https://www.instagram.com/leyendeckers_bn/ [following]
--2020-06-21 22:41:00-- https://www.instagram.com/leyendeckers_bn/
Reusing existing connection to [www.instagram.com]:443.
HTTP request sent, awaiting response... 200 OK
Length: 35934 (35K) [text/html]
Saving to: ‘leyendeckers_bn’
leyendeckers_bn 100%[===================>] 35.09K --.-KB/s in 0.02s
2020-06-21 22:41:01 (1.52 MB/s) - ‘leyendeckers_bn’ saved [130129]
On the production machine (Ubuntu 18.04) I also get the redirect:
wget https://www.instagram.com/leyendeckers_bn
--2020-06-21 22:42:21-- https://www.instagram.com/leyendeckers_bn
Resolving www.instagram.com (www.instagram.com)... 31.13.84.174, 2a03:2880:f207:e5:face:b00c:0:4420
Connecting to www.instagram.com (www.instagram.com)|31.13.84.174|:443... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: https://www.instagram.com/leyendeckers_bn/ [following]
--2020-06-21 22:42:21-- https://www.instagram.com/leyendeckers_bn/
Reusing existing connection to www.instagram.com:443.
HTTP request sent, awaiting response... 302 Found
Cookie coming from www.instagram.com attempted to set domain to i.instagram.com
Cookie coming from www.instagram.com attempted to set domain to i.instagram.com
Location: https://www.instagram.com/accounts/login/?next=/leyendeckers_bn/ [following]
--2020-06-21 22:42:21-- https://www.instagram.com/accounts/login/?next=/leyendeckers_bn/
Reusing existing connection to www.instagram.com:443.
HTTP request sent, awaiting response... 200 OK
Cookie coming from www.instagram.com attempted to set domain to i.instagram.com
Cookie coming from www.instagram.com attempted to set domain to i.instagram.com
Length: 45887 (45K) [text/html]
Saving to: ‘leyendeckers_bn’
leyendeckers_bn 100%[===================>] 44.81K --.-KB/s in 0.02s
2020-06-21 22:42:21 (2.86 MB/s) - ‘leyendeckers_bn’ saved [45887/45887]
The idea that Instagram changed something seems to be true! Hope you/we can find a fix for that.
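For anyone who wants to run the same check from Python instead of wget, here is a rough sketch using only the standard library. The function names are made up for illustration, and the check assumes the login URL keeps the /accounts/login/ path seen in the output above:

```python
import http.client
from urllib.parse import urlsplit


def is_login_redirect(status, location):
    """True when a response redirects to Instagram's login page."""
    if status not in (301, 302, 303, 307, 308) or not location:
        return False
    # Assumption: the login page lives under /accounts/login, as in the wget output.
    return urlsplit(location).path.startswith('/accounts/login')


def probe_profile(username, timeout=10):
    """Request a profile page without following redirects.

    Returns (status, location). This performs a real network request.
    """
    conn = http.client.HTTPSConnection('www.instagram.com', timeout=timeout)
    try:
        conn.request('GET', '/{}/'.format(username))
        response = conn.getresponse()
        return response.status, response.getheader('Location')
    finally:
        conn.close()
```

Calling is_login_redirect(*probe_profile('leyendeckers_bn')) on both machines should show whether the server is the one being sent to the login page.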
Hmm seems odd, question when redisplaying the images, are you linking them back to instagram? I wonder if somehow they are in fact doing something. Only ask as mine is working fine on the server, so I’m looking at what might be different to my setup and yours? I do link the image back to instagram. If you do too, then I’m outta ideas :(
Same here.
local development-machine:
Location: https://www.instagram.com/windfluechter_surfboards/ [following]
--17:31:28-- https://www.instagram.com/windfluechter_surfboards/
=> `index.html'
Resolving www.instagram.com... 69.171.250.174
Connecting to www.instagram.com[69.171.250.174]:443... connected.
HTTP request sent, awaiting response... 200 OK
production-machine:
Location: https://www.instagram.com/accounts/login/?next=/windfluechter_surfboards/ [following]
--2020-06-23 17:32:48-- https://www.instagram.com/accounts/login/?next=/windfluechter_surfboards/
Reusing existing connection to www.instagram.com:443.
HTTP request sent, awaiting response... 200 OK
Length: 46579 (45K) [text/html]
Saving to: `index.html'
What I also noticed is that the production machine is running IPv6 while my local machine is running IPv4. I hope somebody's got an idea.
I've got the same issue :/
Not sure if it's helpful to have another example, but here's what I'm getting.
local development machine (Mac running Django in Docker)
% wget https://www.instagram.com/tdwilson/
--2020-06-25 22:33:22-- https://www.instagram.com/tdwilson/
Resolving www.instagram.com (www.instagram.com)... 157.240.2.174
Connecting to www.instagram.com (www.instagram.com)|157.240.2.174|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 90473 (88K) [text/html]
Saving to: ‘index.html’
index.html 100%[===================>] 88.35K --.-KB/s in 0.06s
2020-06-25 22:33:23 (1.49 MB/s) - ‘index.html’ saved [90473/90473]
heroku instance
~ $ wget https://www.instagram.com/tdwilson/
--2020-06-26 03:35:18-- https://www.instagram.com/tdwilson/
Resolving www.instagram.com (www.instagram.com)... 31.13.66.174, 2a03:2880:f211:e5:face:b00c:0:4420
Connecting to www.instagram.com (www.instagram.com)|31.13.66.174|:443... connected.
GnuTLS: Resource temporarily unavailable, try again.
GnuTLS: Resource temporarily unavailable, try again.
HTTP request sent, awaiting response... 302 Found
Cookie coming from www.instagram.com attempted to set domain to i.instagram.com
Cookie coming from www.instagram.com attempted to set domain to i.instagram.com
Location: https://www.instagram.com/accounts/login/?next=/tdwilson/ [following]
--2020-06-26 03:35:18-- https://www.instagram.com/accounts/login/?next=/tdwilson/
Reusing existing connection to www.instagram.com:443.
HTTP request sent, awaiting response... 200 OK
Cookie coming from www.instagram.com attempted to set domain to i.instagram.com
Cookie coming from www.instagram.com attempted to set domain to i.instagram.com
Length: 45463 (44K) [text/html]
Saving to: ‘index.html’
index.html 100%[===================>] 44.40K --.-KB/s in 0.002s
2020-06-26 03:35:19 (20.0 MB/s) - ‘index.html’ saved [45463/45463]
So from what I'm reading online, it looks like those of us who want to display Instagram photos from public accounts are dead in the water unless we do it using Instagram's Basic Display API.
Is there any interest among the maintainers of this package to make those changes?
Some short research and possible temporary hotfix:
Ok so if I forward the headers of the client to the request it should stop redirecting to the login page?
Yes, this will solve the problem.
So just to be clear, is this something that can be incorporated into a new release?
Addressed in commit: 2e30732afaad695bbbf2c40fdaf92515fd68346d
Changing UA and Accept should be enough. If anyone could test the master branch on their prod env for confirmation, I would know if the fix works.
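To illustrate the idea of forwarding the client's headers, here is a small sketch. It is not the code from the commit; build_forward_headers and the fallback values are assumptions for illustration only. HTTP_USER_AGENT and HTTP_ACCEPT are the keys Django uses in request.META for those headers:

```python
# Fallback values are illustrative assumptions, not the ones used in the commit.
DEFAULT_HEADERS = {
    'User-Agent': 'Mozilla/5.0',
    'Accept': 'text/html,application/xhtml+xml',
}


def build_forward_headers(meta):
    """Build outgoing headers from a Django request.META-like mapping."""
    headers = dict(DEFAULT_HEADERS)  # copy so the defaults are never mutated
    user_agent = meta.get('HTTP_USER_AGENT')
    accept = meta.get('HTTP_ACCEPT')
    if user_agent:
        headers['User-Agent'] = user_agent
    if accept:
        headers['Accept'] = accept
    return headers
```

In a view or template tag this would be passed along with the scraping request, for example requests.get(url, headers=build_forward_headers(request.META)).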
I have tried the master branch; it works fine locally but is still giving the same error in the prod env.
2020-07-24T04:54:39.536728+00:00 app[web.1]: django_instagram.templatetags.instagram_client - ERROR - path to profile media not found
2020-07-24T04:54:39.536738+00:00 app[web.1]: Traceback (most recent call last):
2020-07-24T04:54:39.536740+00:00 app[web.1]: File "/app/.heroku/python/lib/python3.6/site-packages/django_instagram/templatetags/instagram_client.py", line 34, in get_profile_media
2020-07-24T04:54:39.536741+00:00 app[web.1]: edges = profile['entry_data']['ProfilePage'][page]['graphql']['user']['edge_owner_to_timeline_media']['edges']
2020-07-24T04:54:39.536742+00:00 app[web.1]: KeyError: 'ProfilePage'
I'm using a heroku app to deploy the web.
I have tried to implement this solution found on StackOverflow, but it didn't work.
...
try:
    url_login = 'https://www.instagram.com/accounts/login/'
    url_main = url_login + 'ajax/'
    auth = {'username': os.environ.get('IG_USER'), 'password': os.environ.get('IG_PASSWORD')}
    with requests.Session() as s:
        req = s.get(url_login)
        s.post(url_main, data=auth, headers={
            'x-csrftoken': req.cookies['csrftoken'],
            'referer': "https://www.instagram.com/accounts/login/",
            'User-Agent': headers['User-Agent'],
            'Accept': headers['Accept']
        })
        url = "https://www.instagram.com/{}/".format(username)
        page = s.get(url, headers={
            'User-Agent': headers['User-Agent'],
            'Accept': headers['Accept']
        })
        # Raise error for 404 cause by a bad profile name
        page.raise_for_status()
        return html.fromstring(page.content)
...
"Extended" logs:
2020-07-24T05:50:48.163614+00:00 app[web.1]: Profile: {'LoginAndSignupPage': [{'captcha': {'enabled': False, 'key': ''}, 'gdpr_required': False, 'tos_version': 'row', 'username_hint': ''}]}
2020-07-24T05:50:48.164208+00:00 app[web.1]: django_instagram.templatetags.instagram_client - ERROR - Profile: {'LoginAndSignupPage': [{'captcha': {'enabled': False, 'key': ''}, 'gdpr_required': False, 'tos_version': 'row', 'username_hint': ''}]}
2020-07-24T05:50:48.164209+00:00 app[web.1]: Traceback (most recent call last):
2020-07-24T05:50:48.164210+00:00 app[web.1]: File "/app/.heroku/python/lib/python3.6/site-packages/django_instagram/templatetags/instagram_client.py", line 34, in get_profile_media
2020-07-24T05:50:48.164210+00:00 app[web.1]: edges = profile['entry_data']['ProfilePage'][page]['graphql']['user']['edge_owner_to_timeline_media']['edges']
2020-07-24T05:50:48.164214+00:00 app[web.1]: KeyError: 'ProfilePage'
2020-07-24T05:50:48.164389+00:00 app[web.1]: django_instagram.templatetags.instagram_client - ERROR - path to profile media not found
2020-07-24T05:50:48.164390+00:00 app[web.1]: Traceback (most recent call last):
2020-07-24T05:50:48.164390+00:00 app[web.1]: File "/app/.heroku/python/lib/python3.6/site-packages/django_instagram/templatetags/instagram_client.py", line 34, in get_profile_media
2020-07-24T05:50:48.164391+00:00 app[web.1]: edges = profile['entry_data']['ProfilePage'][page]['graphql']['user']['edge_owner_to_timeline_media']['edges']
2020-07-24T05:50:48.164394+00:00 app[web.1]: KeyError: 'ProfilePage'
I think the POST request sent with the auth param returns a 400 response.
Thanks for your work @marcopompili . Let me know if more information is needed.
UPDATE
I found this explanation in a PHP Instagram scraper GitHub gist. TLDR: Instagram "bans" IPs that make constant requests to the same URL. I'm not sure if it is right, but it makes sense. Currently I'm in a hurry, so I decided to make the request on the front end using JS (example). Now it does retrieve the images in both the dev and prod env.
Maybe there is a way to make the request on the front end (so the Instagram server gets our user's IP and not the server IP for each request) and "receive" the response in the back end, so we can still use the handy templatetags and functionality of this Django app.
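If the explanation about repeated requests is right, caching the scraped result on the server would at least cut down how often the same URL is hit. Below is a minimal in-memory sketch; the TTLCache class is hypothetical, and there is no guarantee that fewer requests actually avoids the login redirect. In a real Django project the built-in cache framework would be the more natural place for this:

```python
import time


class TTLCache:
    """Tiny in-memory cache keeping each value for `ttl` seconds."""

    def __init__(self, ttl, clock=time.monotonic):
        self.ttl = ttl
        self.clock = clock  # injectable so the expiry logic can be tested
        self._store = {}

    def get_or_fetch(self, key, fetch):
        """Return the cached value for `key`, calling fetch(key) when stale."""
        now = self.clock()
        hit = self._store.get(key)
        if hit is not None and now - hit[0] < self.ttl:
            return hit[1]
        value = fetch(key)
        self._store[key] = (now, value)
        return value
```

With a TTL of, say, ten minutes, a page rendered many times would trigger at most one request to Instagram per profile in that window.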
@marcopompili Any update on this issue?
I still have this problem. I get the following error message (only in the production environment):
File "/env/lib/python2.7/site-packages/django_instagram/templatetags/instagram_client.py", line 28, in get_profile_media
edges = profile['entry_data']['ProfilePage'][page]['graphql']['user']['edge_owner_to_timeline_media']['edges']
TypeError: 'NoneType' object has no attribute '__getitem__'
Hi folks, I'm facing the same issue in production. Any update or idea how to solve it? What did you guys do?
hi folks any fix yet?
My quick-fix solution was to do the scraping on the front end, as I posted in https://github.com/marcopompili/django-instagram/issues/27#issuecomment-663360226.
I'm trying to find time to fork the repository and trying to implement a better solution.
Sir, I don't know JavaScript. Can you help me put the script from your UPDATE section in the right place?
Yes I can, you have to add the code within a Githubissues.
Django Instagram was working, but now I receive the following error only in Production. In my development environment it works fine. It was working in production yesterday.