whizzzkid / instagram-proxy-api

CORS compliant API to access Instagram's public data
https://nishantarora.in/building-your-image-gallery-using-public-instagram-API.naml

{"error":"TypeError: Cannot read property 'user' of undefined"} #17

Closed waclock closed 6 years ago

waclock commented 6 years ago

If I try scraping more than 1 URL "fast" (No sleep/wait between calls), I get the error

{"error":"TypeError: Cannot read property 'user' of undefined"}

This is currently happening on the official project's URL and on a custom server I set up to scrape certain users. I believe Instagram has some sort of "anti-spam" filter in place now, but I'm not really sure.

whizzzkid commented 6 years ago

This service will deny scraping requests in the future; if it's working now, that's only temporary so people can try this service. I request that you scrape the official GraphQL URLs from Instagram. The aim of this project is only to provide CORS-compliant data to webpages which cannot load JSON from Instagram's GraphQL URLs.

BTW: https://igapi.ga/whizzzkid/media?count=3 works

waclock commented 6 years ago

I know https://igapi.ga/whizzzkid/media?count=3 works, but if you try spamming F5 5-10 times, the error will come up.

I don't really understand your last message. Does this mean the project is dead? Or that we should have personal servers ourselves to scrape data? My understanding is that this project uses instagram graphql, doesn't it? Otherwise I guess we'd have to start from scratch interacting with IG?

whizzzkid commented 6 years ago

First: does spamming this URL create the same effect? https://igpi.ga/graphql/query/?user_id=1606740656&count=3

Second: if this works for you, please use the 1-click deploy and have your own heroku instance running in less than a minute. This way other users won't be affected while using this service.

waclock commented 6 years ago

Whiz, I can't seem to reproduce the error anymore on either igpi.ga or the private Heroku server I run for this project (which I set up using the 1-click deploy). I'm not sure what could've caused the issue.

whizzzkid commented 6 years ago

That was probably because of the server load. igpi.ga and igapi.ga are both running on free instances on Heroku. They sometimes become unresponsive if there are too many requests. I hope the issue is now resolved :)

waclock commented 6 years ago

Whizzz, sorry for opening this again. I'm still facing this issue when scraping multiple profiles or making calls very fast on my own server (note that I'm not spamming your endpoint). Do you know why this might be? Is there some kind of limitation in either the project or in IG's API? I really don't want to start communicating from scratch with IG's API, as I really like how this project works with pagination and limit.

whizzzkid commented 6 years ago

I don't believe there should be any kind of limit on IG's end for this query. However, they might be limiting by IP address. The strange thing is, igpi.ga was responding with the same errors, and restarting all dynos fixed it. I am not sure if it's a bug in the code. I'll have a look at the code again, and probably find someone to review it for me.

I am not sure how fast you're going when this issue occurs. For now, can you stop using the simpler URL form (https://igpi.ga/whizzzkid/media/?count=3) and try https://igpi.ga/graphql/query/?user_id=1606740656&count=3 instead, to see if this happens again? It'll help isolate the issue.
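
On the possibility of a bug in the code: an error like "Cannot read property 'user' of undefined" typically means some object was expected to carry a `user` field but the object itself was missing, for example because the upstream response was not the usual JSON. Below is a minimal sketch of a guard for that case, written in Python for illustration. The `{"data": {"user": ...}}` shape, the `extract_user` helper and the `UpstreamShapeError` class are all assumptions made for this example; they are not taken from the proxy's source.

```python
from typing import Any, Dict


class UpstreamShapeError(Exception):
    """Raised when the upstream JSON lacks the keys we expected (hypothetical)."""


def extract_user(payload: Dict[str, Any]) -> Dict[str, Any]:
    # Assumed shape: {"data": {"user": {...}}}. The real payload may differ.
    data = payload.get("data")
    if not isinstance(data, dict):
        # An unguarded JavaScript read of payload.data.user would throw
        # "Cannot read property 'user' of undefined" at this point.
        raise UpstreamShapeError("upstream response has no 'data' object")
    user = data.get("user")
    if not isinstance(user, dict):
        raise UpstreamShapeError("upstream response has no 'user' object")
    return user
```

Raising a descriptive error here would at least show whether the upstream reply changed shape (for instance when requests are being limited) rather than surfacing a bare TypeError.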

waclock commented 6 years ago

I'm actually using the graphql/query endpoint and the "/username" to get the user's ID.

Right now, if I try clicking on any of the links you specified in your last response, I get the same error. The weird thing is this didn't happen before Instagram changed their API (before you had to do the workarounds and before graphql/query appeared).

I'm doing the calls quite fast: I'm trying to scrape the whole Instagram of some users, with a count of 50 and a 1-second sleep between calls.
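
For a client making calls at this pace, one option is to treat a body containing an "error" key as a transient failure and retry with an increasing delay. The sketch below is only an illustration under that assumption: `fetch_with_backoff` and its parameters are hypothetical names, and `fetch` stands in for whatever function performs the HTTP request and returns the response body as text.

```python
import json
import time
from typing import Any, Callable, Dict


def fetch_with_backoff(
    fetch: Callable[[], str],
    retries: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Call fetch() until the JSON body has no "error" key.

    The wait doubles after every failed attempt (1s, 2s, 4s, ...).
    """
    delay = base_delay
    last_error = None
    for attempt in range(retries):
        payload = json.loads(fetch())
        if "error" not in payload:
            return payload
        last_error = payload["error"]
        # Only wait if another attempt is still coming.
        if attempt < retries - 1:
            sleep(delay)
            delay *= 2
    raise RuntimeError(f"still failing after {retries} attempts: {last_error}")
```

In real use, `fetch` could wrap something like `urllib.request.urlopen(url).read().decode()`. If the errors persist even with long delays, that would point away from simple load and towards a block or a bug.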

whizzzkid commented 6 years ago

Ok, so I added more error descriptions and limited the count on calls. I would recommend forking this proxy and changing the fetch count to something like 500 to see if it works for you.
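
If the limit is read as a cap on the `count` query parameter (an assumption; the thread does not say how it is implemented), the idea can be sketched as below. The `clamp_count` helper, the default of 3 and the cap of 50 are hypothetical values chosen for illustration; a fork would raise `max_count` to something like 500.

```python
from typing import Optional


def clamp_count(requested: Optional[str], default: int = 3, max_count: int = 50) -> int:
    """Parse a ?count= query value and keep it within [1, max_count]."""
    try:
        value = int(requested)
    except (TypeError, ValueError):
        # Missing or non-numeric input falls back to the default.
        return default
    return max(1, min(value, max_count))
```

For example, `clamp_count("500")` yields 50 with the assumed cap, while `clamp_count("500", max_count=500)` lets the full 500 through.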

waclock commented 6 years ago

Thanks Wizz, I'll give it a shot. Will let you know if I still face the same issues (which "should" mean IG might block based on IP?)