chris-greening / instascrape

Powerful and flexible Instagram scraping library for Python, providing easy-to-use and expressive tools for accessing data programmatically
https://chris-greening.github.io/instascrape/
MIT License
630 stars, 108 forks

Login to Instagram #9

Closed (chris-greening closed this 3 years ago)

chris-greening commented 3 years ago

Is your feature request related to a problem? Please describe.
Depending on usage, this library won't work as intended because Instagram seemingly checks cookies to make sure the user isn't a random bot. If you use instascrape from a personal computer that has been logged into Instagram before, this doesn't seem to be a problem. Unfortunately, the library breaks if you try to use it from, say, a remote server that has never logged into Instagram before.

Describe the solution you'd like
Provide a way to bypass these restrictions by either

Describe alternatives you've considered
I considered using Selenium to log in, but I really don't want to force people to install or use Selenium. The whole purpose of the library is to be lightweight, and Selenium has too much overhead and is slow.

Additional context
N/A

MNISAR commented 3 years ago

I do not think one can bypass two-factor authentication. What we can do is use email as the second factor, add some simple code to extract the code from the email, and then log in the first time (we do not need to worry about the next login). From https://github.com/timgrossmann/InstaPy/blob/master/instapy/login_util.py you can see that they wrote a bypass_suspicious_login method to do this. It can be used as a reference.
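For illustration, the "extract the code from the email" step might look roughly like the sketch below. It assumes the raw email text has already been fetched by some other means and that the code appears as a standalone 6-digit number in a plain-text part; both are assumptions, and `extract_security_code` is a hypothetical helper, not something from instascrape or InstaPy.

```python
import re
from email import message_from_string


def extract_security_code(raw_email, digits=6):
    """Return the first standalone `digits`-long number in the plain-text body, or None.

    Hypothetical helper: the 6-digit default is an assumption about the email format.
    """
    message = message_from_string(raw_email)

    # Collect every text/plain part (walk() also yields a non-multipart message itself).
    bodies = []
    for part in message.walk():
        if part.get_content_type() == "text/plain":
            payload = part.get_payload(decode=True)
            charset = part.get_content_charset() or "utf-8"
            bodies.append(payload.decode(charset, errors="replace"))
    body = "\n".join(bodies)

    # Match exactly `digits` digits that are not part of a longer number.
    match = re.search(r"(?<!\d)\d{%d}(?!\d)" % digits, body)
    return match.group(0) if match else None
```

Fetching the message from the mailbox is deliberately left out, since how that would be done depends on the mail provider.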

chris-greening commented 3 years ago

@MNISAR, thanks for the info! Funny that you mention bypass_suspicious_login, I was literally just looking at instapy's login_util.py like two days ago for inspiration.

I'm apprehensive of using selenium for this project because I want to avoid the overhead of opening any browsers or needing any drivers. The original inspiration for this project was actually so that it could run automated on a remote server and not have to worry about hacking together some sort of virtual display or something to get selenium working.

I've caught wind that you might be able to POST request your way through with requests, as recently as 2 months ago (see here), but I wasn't able to find success in the twenty minutes I spent tinkering with it lol. A lightweight solution like that is definitely the preferred way of getting in if possible.

Regarding the 2-factor authentication though, I figure that's a bridge we can cross when we get to it. I'm pretty sure Insta has 2-factor turned off by default and I just want to get in under normal circumstances

MNISAR commented 3 years ago

Good to hear that my comment was helpful! Also, did you check out headless mode in Selenium? Check this link; it covers the benefits of using headless mode in tools such as Selenium (it reduces overhead). One more thing about "remembering login": you can use --user-data-dir=chrome-data in Selenium. It works as if a user were using Chrome, so you do not have to log in every time. This might reduce suspicious login attempts.
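The same "remember the login" idea can also be sketched without a browser, by saving cookies to disk and reloading them on the next run. The snippet below is only a sketch using the standard library; `load_cookie_jar`, `save_cookie_jar`, and `make_opener` are hypothetical helper names, and whether Instagram would accept cookies restored this way is an open assumption, not something confirmed in this thread.

```python
import http.cookiejar
import os
import urllib.request


def load_cookie_jar(path):
    """Return a cookie jar backed by `path`, loading saved cookies if the file exists."""
    jar = http.cookiejar.MozillaCookieJar(path)
    if os.path.exists(path):
        # ignore_discard=True also restores session cookies (those without an expiry).
        jar.load(ignore_discard=True)
    return jar


def save_cookie_jar(jar):
    """Write the jar back to its file, keeping session cookies too."""
    jar.save(ignore_discard=True)


def make_opener(jar):
    """Create a urllib opener that sends and stores cookies through `jar`."""
    return urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
```

A run would call `load_cookie_jar("cookies.txt")`, make its requests through `make_opener(jar)`, and finish with `save_cookie_jar(jar)`. Since a requests Session accepts any `http.cookiejar.CookieJar`-compatible object as its `cookies` attribute, the same jar could in principle be attached to a Session instead of a urllib opener.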

chris-greening commented 3 years ago

@MNISAR definitely a solid suggestion, instascrape is a bare bones adaptation of a dynamic scraper I wrote that was Selenium heavy for scraping dynamically rendered Instagram content; it isn't that much of a leap to integrate some optional PhantomJS or headless Chrome action to do that dynamic stuff requests just can't do

Certainly food for thought moving forward!

NivardoX commented 3 years ago

Hey @chris-greening, have you seen this approach? It had some recent updates. Perhaps it's a good way of avoiding selenium and heavier tools.

chris-greening commented 3 years ago

> Hey @chris-greening, have you seen this approach? It had some recent updates. Perhaps it's a good way of avoiding selenium and heavier tools.

That's exactly the idea I was going for! Definitely looking to use requests.post if it's possible. I'll tinker around with this later today, thanks for the reference 😏

NivardoX commented 3 years ago

> > Hey @chris-greening, have you seen this approach? It had some recent updates. Perhaps it's a good way of avoiding selenium and heavier tools.
>
> That's exactly the idea I was going for! Definitely looking to use requests.post if it's possible. I'll tinker around with this later today, thanks for the reference 😏

Awesome, when you start coding this, tag me. I would love to help you with this. 😄