ChristianZX closed 3 years ago
snscrape is not a crawler. It is manually invoked and merely emulates you opening a specific search or profile page in a browser, scrolling to the bottom, and extracting the tweets. robots.txt does not apply.
But isn't
opening a specific .. profile page in a browser, scrolling to the bottom, and extracting
pretty much the definition of web crawling? Is it possible to find out which path snscrape is using?
Since when is opening Twitter in a browser and using a mouse wheel considered crawling now? Am I actually a robot and lied to all those captchas? O_o
Scraping is not (necessarily) crawling. robots.txt is for systems like search engines that recursively walk through and index a website; it tells them which parts of the site to avoid in such automated crawls. That's not something snscrape does.
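For anyone who wants to see what a robots.txt check looks like from a crawler's side, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt content, the `example-crawler` user agent, the `example.com` URLs, and the `is_allowed` helper are all made up for illustration; this is not Twitter's actual file and not something snscrape does.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
# This is NOT Twitter's actual robots.txt.
EXAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /search
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# A well-behaved crawler would run a check like this before each automated fetch.
search_ok = is_allowed(EXAMPLE_ROBOTS_TXT, "example-crawler", "https://example.com/search")
profile_ok = is_allowed(EXAMPLE_ROBOTS_TXT, "example-crawler", "https://example.com/someuser")

print("search allowed:", search_ok)
print("profile allowed:", profile_ok)
```

To check a real site, you could point `RobotFileParser.set_url()` at its robots.txt and call `read()` instead of passing the text in directly.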
Great tool. But is it respecting Twitter's robots.txt?