Open QihanWangCo opened 2 years ago
Same issue here. I've had working code since 05/07/22 with basically the same structure as yours, and it ran fine. Until today, that is: now it breaks at the same line (I believe), during the .get_items() call in the for loop.
Also adding another part of the error that may have to do with the issue "_logger.warning(f'Page does not exist')".
```
[...]/Python310/lib/site-packages/snscrape/modules/instagram.py
    106 def get_items(self):
--> 107     r = self._initial_page()
    108     if r.status_code == 404:
    109         _logger.warning(f'Page does not exist')
```
As the comment there suggests, this is due to changes on Instagram's side. They recently overhauled their site a bit. The scraper needs to be adapted to those changes.
Thanks for your answer! Really looking forward to the adaption!!
Any updates on this yet? Curious if we can help somehow!
If you're a programmer, you could send a fix via the "pull requests" feature (or just by suggesting a fix!).
Yeah, I know how GitHub works — just wanted to know whether there is any active development happening elsewhere on this particular issue.
Is this a dead repo now?
No, but there hasn't been anything worth saying.
This issue, along with any other Instagram or Facebook issues, is effectively blocked by their silly rate limits. They make development of the corresponding scrapers very annoying since rapid testing is very tricky. I haven't had time to look into possible workarounds to make that less unpleasant and less time-intensive. So for now, those scrapers are unfortunately poorly supported by me. I'll happily consider PRs though.
Hey @JustAnotherArchivist, I'm trying to solve this issue. Can you share what we're looking for in the source code returned?
Is it a JSON link or plain JSON? Currently, there is no script with the type "text/javascript" returned by Instagram.
It would be great if you could share what was being stored in "jsonData" before this error came. Thanks!
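For anyone who wants to check this locally, here is a minimal sketch of testing whether a page still carries an embedded JSON blob inside a "text/javascript" script tag. The marker string (`window._sharedData = `) and the helper name `extract_embedded_json` are assumptions for illustration; the thread does not confirm the exact format the scraper expected.

```python
import json
from typing import Any, Optional

# ASSUMPTION: hypothetical start/end markers, not confirmed by this thread.
MARKER = '<script type="text/javascript">window._sharedData = '
END = ';</script>'


def extract_embedded_json(html: str, marker: str = MARKER, end: str = END) -> Optional[Any]:
    """Return the JSON embedded after `marker`, or None if it is missing or invalid."""
    start = html.find(marker)
    if start == -1:
        # The page no longer contains the script tag we were looking for.
        return None
    start += len(marker)
    stop = html.find(end, start)
    if stop == -1:
        return None
    try:
        return json.loads(html[start:stop])
    except json.JSONDecodeError:
        return None
```

Returning `None` instead of indexing the result of `str.split(...)` directly avoids an `IndexError` when the marker is absent, which would be consistent with the page no longer shipping such a script tag.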
@purut18 I don't recall the exact format etc., but it was basically some context information (profile, hashtag, location, etc.) and the first page of posts, I believe.
Well... nothing like that is being returned in the source code of Instagram now. (If someone else can confirm this, please?)
I think Instagram changed it or moved to dynamic rendering to prevent scraping :/
I am working on a fix for Instagram. So far, searching by user and by hashtag is working. Location will be soon™️
In #1001?
For logged-out users, location queries always return a single page of data, and there is a pretty strict rate limit on getting data from the platform. But data is returned, for now.
@0bmay I keep getting "IndexError: list index out of range" when running "for post in sns.InstagramHashtagScraper(query).get_items()". How can I resolve this? ;/
@feusagittaire The pull request hasn't been merged to snscrape yet
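In the meantime, a caller can at least fail gracefully instead of crashing inside the loop. The sketch below is a generic wrapper, not part of snscrape: `collect_items` is a hypothetical helper name, and catching `IndexError` is only a stopgap that assumes the error is raised while iterating over `get_items()`.

```python
from itertools import islice
from typing import Iterable, List, TypeVar

T = TypeVar("T")


def collect_items(items: Iterable[T], limit: int) -> List[T]:
    """Collect up to `limit` items, keeping whatever arrived before an IndexError."""
    results: List[T] = []
    try:
        for item in islice(items, limit):
            results.append(item)
    except IndexError as exc:
        # Stopgap only: the scraper itself still needs the upstream fix.
        print(f"Scraper stopped early: {exc!r}")
    return results


# Usage with snscrape (requires the package to be installed):
# import snscrape.modules.instagram as sns
# posts = collect_items(sns.InstagramHashtagScraper("python").get_items(), 10)
```

This does not make the scraper work again; it only returns the items gathered so far (possibly none) so the rest of a script can continue.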
Tysm for that! If I may ask, will it be implemented any time soon?
@feusagittaire Until the pull request is merged, you should be able to do a pip install -U git+https://github.com/0bmay/snscrape@insta_fix
to install their copy of snscrape.
tysm for the tip!!
Hi, I want to use snscrape to collect Instagram data. My code is:
And I got this error:
How can I fix it?