chris-greening / instascrape

Powerful and flexible Instagram scraping library for Python, providing easy-to-use and expressive tools for accessing data programmatically
https://chris-greening.github.io/instascrape/
MIT License
630 stars, 107 forks

profile_pic_url and profile_pic_url_hd give incorrect values when using session ID #82

Closed (boompig closed 3 years ago)

boompig commented 3 years ago

Code to Reproduce

Let's use Kim Kardashian's IG account as a good example. The session ID can be retrieved in the usual way.

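from instascrape import Profile

# Placeholders: substitute your own browser user agent and sessionid cookie value
USER_AGENT = "YOUR_USER_AGENT_STRING"
SESSION_ID = "YOUR_SESSION_ID"

user = Profile("kimkardashian")
headers = {
    "user-agent": USER_AGENT,
    "cookie": "sessionid=%s" % SESSION_ID
}
user.scrape(headers=headers)
user.to_dict()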

This yields correct data in every field I checked except profile_pic_url and profile_pic_url_hd. Both of those URLs point to my own profile picture (the account the session ID belongs to) instead of the scraped profile's picture. Possibly this is an IG anti-scraping technique?

Version

Latest install as of this writing, with Python 3.8.

chris-greening commented 3 years ago

Hello!

Yeah, Instagram has been switching things up a lot the last month or so 😅

The bug occurs because passing a sessionid emulates a request from a logged-in account, and when you're logged in, the JSON they serve back is slightly different.
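To make that explanation concrete, here is a minimal sketch of one way a slightly different payload could produce the wrong picture. Everything in it is an assumption for illustration: the payload shapes, the `config`/`viewer` keys, the example URLs, and the helper names `find_first` and `find_for_user` are hypothetical and are not taken from Instagram's real JSON or from instascrape's source. The idea is only that a lookup which takes the first matching key can start returning the logged-in viewer's data once the response also describes the viewer.

```python
# Hypothetical payloads: the real Instagram JSON is not shown in this thread.
KIM_PIC = "https://example.com/kim.jpg"
MY_PIC = "https://example.com/me.jpg"

logged_out = {
    "graphql": {"user": {"username": "kimkardashian", "profile_pic_url": KIM_PIC}},
}
logged_in = {
    # Assumed extra block describing the account that owns the session ID
    "config": {"viewer": {"username": "me", "profile_pic_url": MY_PIC}},
    "graphql": {"user": {"username": "kimkardashian", "profile_pic_url": KIM_PIC}},
}


def _children(obj):
    """Return the nested values of a dict or list, or an empty list otherwise."""
    if isinstance(obj, dict):
        return list(obj.values())
    if isinstance(obj, list):
        return obj
    return []


def find_first(obj, key):
    """Depth-first search that returns the first value stored under `key`."""
    if isinstance(obj, dict) and key in obj:
        return obj[key]
    for child in _children(obj):
        found = find_first(child, key)
        if found is not None:
            return found
    return None


def find_for_user(obj, username, key):
    """Return `key` only from a dict whose 'username' matches `username`."""
    if isinstance(obj, dict) and obj.get("username") == username and key in obj:
        return obj[key]
    for child in _children(obj):
        found = find_for_user(child, username, key)
        if found is not None:
            return found
    return None


print(find_first(logged_out, "profile_pic_url"))
print(find_first(logged_in, "profile_pic_url"))
print(find_for_user(logged_in, "kimkardashian", "profile_pic_url"))
```

In this toy setup the first-match lookup returns the viewer's picture for the logged-in payload, while the lookup keyed on the requested username returns the same picture in both cases. Whether the actual fix works this way is not stated in the thread.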

I just patched it to work regardless of sessionid and pushed it to PyPI; check out v2.1.2! Thanks for reaching out!
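For anyone landing here later, a quick way to confirm that the installed copy is at least v2.1.2 might look like the sketch below. It assumes the PyPI distribution is named `insta-scrape` (please verify against your own `pip list`), and the helper names `parse_version`, `has_fix`, and `installed_has_fix` are made up for this example. It uses only the standard library, which provides `importlib.metadata` from Python 3.8 onward.

```python
from importlib import metadata
from itertools import takewhile

FIXED_VERSION = (2, 1, 2)  # release named in the maintainer's reply


def parse_version(text):
    """Turn '2.1.2' into (2, 1, 2), keeping only each part's leading digits."""
    parts = []
    for piece in text.split("."):
        digits = "".join(takewhile(str.isdigit, piece))
        if not digits:
            break  # stop at the first part that does not start with a digit
        parts.append(int(digits))
    return tuple(parts)


def has_fix(installed, fixed=FIXED_VERSION):
    """True if the version string `installed` is at or above `fixed`."""
    return parse_version(installed) >= fixed


def installed_has_fix(dist_name="insta-scrape"):
    """Check the installed distribution; None means it is not installed.

    The default distribution name is an assumption, not taken from this thread.
    """
    try:
        return has_fix(metadata.version(dist_name))
    except metadata.PackageNotFoundError:
        return None


print(installed_has_fix())
```

If this prints `False`, upgrading with pip should pull in the patched release; `None` means no distribution with that name was found in the current environment.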