novitae / sterraxcyl

Instagram OSINT tool to export and analyse followers | following with their details
GNU General Public License v3.0
534 stars 59 forks

Rate Limit Error although followers extraction is completed #27

Open vitguld opened 1 year ago

vitguld commented 1 year ago

Hey,

I'm having an issue resuming from a "part" file. I'm using the following command syntax: sterra export -ssid xxx -u xxx -p xxx

When the followers extraction hit 100%, the "RateLimitError" showed up, so I didn't use sterra for three days (I had usually run it every 24 hours to finish the extraction phase). Now, when I try to resume the script with the syntax above, I still get the "RateLimitError".

When using Instagram on mobile or web, I am able to use the search, like photos, follow, etc. To make sure, I ran sterra again without the "-p" flag, using the following: sterra export -ssid xxx -u xxx -t followers. It started the process all over again, which meant it wasn't a rate-limit blockage.

Also, the "Getting users details" step completed to 100% (and then hit the rate limit again).

AdamQuadmon commented 1 year ago

any luck with this? I just finished my first export and got the same error

AdamQuadmon commented 1 year ago

for now I just converted the json part file to csv to use the scraped data and this is enough for my needs

vitguld commented 1 year ago

@AdamQuadmon No luck. I tried to convert the JSON part file to CSV (using an online converter), but it all comes out misaligned. Mind sharing how you managed to convert yours to a readable CSV file?

AdamQuadmon commented 1 year ago

@vitguld I just used the first online service I found on Google: convertcsv.com. Maybe you can open the file in an editor like VS Code and check the JSON file for errors and missing brackets.
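For anyone who would rather convert the part file locally, here is a minimal Python sketch. The structure of sterra's part file is not described in this thread, so this assumes the file holds either a JSON list of per-user objects or an object whose values are such objects. The names `flatten` and `part_json_to_csv` are hypothetical helpers, not part of sterra. Misaligned CSV output often comes from records that do not all share the same keys, so the sketch builds one column set from every record and leaves missing cells empty. If the file has missing brackets, `json.load` raises `json.JSONDecodeError` with the line and column of the problem.

```python
import csv
import json


def flatten(record, prefix=""):
    """Flatten nested dicts into one dict with dotted keys."""
    flat = {}
    for key, value in record.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=f"{name}."))
        elif isinstance(value, list):
            # Keep a list in a single cell as JSON text.
            flat[name] = json.dumps(value, ensure_ascii=False)
        else:
            flat[name] = value
    return flat


def part_json_to_csv(json_path, csv_path):
    """Convert a JSON file of user records to CSV; return the row count."""
    with open(json_path, encoding="utf-8") as handle:
        data = json.load(handle)  # raises json.JSONDecodeError if malformed

    # Assumption: a list of user objects, or a mapping of key -> user object.
    records = list(data.values()) if isinstance(data, dict) else list(data)
    rows = [flatten(r) for r in records if isinstance(r, dict)]

    # Union of all keys in first-seen order, so every row has the same columns.
    columns = []
    for row in rows:
        for key in row:
            if key not in columns:
                columns.append(key)

    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=columns, restval="")
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)
```

Usage would look like `part_json_to_csv("part.json", "followers.csv")`, where both file names are placeholders for your own paths.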

mikeysan commented 1 year ago

Hey @AdamQuadmon ,

You mentioned in September that you were working on v3. Do you have an ETA on that yet? I starred your project but haven't used it in a while.

Perhaps, I could test this issue when v3 is released.

Mikey

novitae commented 1 year ago

@mikeysan hi

Progress on v3 has never been this slow... It is also slowed down by the fact that some features I found while researching Instagram cannot be open-sourced: because of their severity and rarity, publishing them would lead Instagram to patch some of them, and others would steal them. On top of all this, I decided that the next version will mainly be an online service on a website, and since I have never done frontend or backend work, I have to learn all of it, knowing that the only language I really know is Python 🥲. So for now it is mainly research on Instagram, no development.

This work is also the reason why I am not checking issues here anymore: the new version won't use the same endpoints and techniques at all, so I don't have the time to dive back into the v2 environment (also, the code is trash; I don't know how it still works "well" with such bad code).

mikeysan commented 1 year ago

Thanks for the feedback :) I can appreciate the learning curve you're dealing with. I am learning Python at the moment and I feel it every time I look at a project and want to contribute. I think, "Not yet; you are not there yet" 😄

timothy-cloudopsguy commented 1 year ago

Just throwing this out there... It's not technically a RateLimitError happening. What I've experienced is that whether I set the delay to 1, 3, 5, 7, or 10 seconds, I always run up against this RateLimitError at some point in the process, around 100-200 followers.

However, when I actively click around in the browser window where I pulled the sessionID, clicking and scrolling at least every 5 minutes, the RateLimitError doesn't hit. It's then usually around 400-500 followers retrieved where I end up getting bored and distracted and forget to click around in the web browser. And then the RateLimitError happens again.

So, what I would suspect, is that you could add some other API commands in the "delay" where you fake scrolling or clicking into profiles or something.

What I really think is happening is that running the exact same command over and over again is the actual issue. It's not a rate limit per se, it's just a "hey, you appear to be a robot"... This tracks because on one of my accounts, when I click back into the web browser, it logs me out and makes me do a robot verification matching photos.

Edit 1: Also note that scrolling around in the app on your phone won't work to keep the RLE from happening, since it's not tied to the sessionID you're using in the sterra command.

Edit 2: I tried just letting the Reels in my friends list scroll infinitely, and it timed out. I really think it's a ~10 minute activity check. You've just gotta randomly scroll/click/search and it keeps it active longer. (I have almost 3000 followers; I couldn't imagine using this application/command for bigger accounts. The only reason I need the data is because my marketing company requested the list in an Excel spreadsheet for a project. I cannot imagine what reasons are behind IG making it this difficult to grab your followers list.)

Edit 3: I found this Chrome extension that gives a bit deeper (not full) explanation of the different limits you can hit. I believe I've run into all of them, which is super annoying. I just want to grab my followers list, why does this need to take 4 days to do it? lol

Link: https://igexport.converts.cc/

"""
What are the limits of Instagram?

Instagram (not IGFollow) limits the number of web requests in any given time period. Their limits are not exactly known, may change at any time, and will vary from account to account.

There are four main types of restrictions:

A 429 limit is very common. It usually takes between 1 and 15 minutes to clear.
A 400 restriction is occasionally hit and usually requires a simple account verification to be completed and restored.
A soft restriction is occasionally hit and usually lasts about 10 minutes, after which it can be restored.
Hard restrictions occur less frequently but last longer, usually at least 12 hours, sometimes up to 48 hours.
"""

Cheers, Timothy
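
The four restriction types quoted above could be turned into a simple pause-and-retry policy. The sketch below is only an illustration under stated assumptions: the wait ranges are that page's estimates, not documented Instagram values, and the labels "429", "400", "soft" and "hard", as well as the function `suggested_wait`, are hypothetical. Nothing in this thread indicates that sterra itself distinguishes these cases; it only reports a RateLimitError.

```python
import random

# Wait ranges in seconds, taken from the limits quoted in the comment above.
# These are that page's estimates, not documented Instagram values.
WAIT_RANGES = {
    "429": (60, 15 * 60),            # clears in 1 to 15 minutes
    "soft": (10 * 60, 10 * 60),      # lasts about 10 minutes
    "hard": (12 * 3600, 48 * 3600),  # at least 12 hours, up to 48
}


def suggested_wait(kind, attempt, jitter=0.1, rng=random):
    """Return seconds to pause before retrying, or None if waiting won't help.

    kind: "429", "400", "soft" or "hard" (hypothetical labels).
    attempt: 0 for the first retry, 1 for the second, and so on.
    """
    if kind == "400":
        # Needs a manual account verification, so a timed retry is pointless.
        return None
    low, high = WAIT_RANGES[kind]
    # Exponential growth from the low end, capped at the top of the range.
    base = min(low * (2 ** attempt), high)
    # Add up to `jitter` (10% by default) so retries are not perfectly regular.
    return base * (1 + rng.uniform(0, jitter))
```

For example, `suggested_wait("429", 0)` gives a pause of a little over a minute, and each further attempt doubles it until the 15-minute cap is reached.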