JustAnotherArchivist / snscrape

A social networking service scraper in Python
GNU General Public License v3.0
4.33k stars, 699 forks

Snscrape doesn't see tweets from accounts marked as sensitive #151

Closed: IPockAUsername closed this issue 3 years ago

IPockAUsername commented 3 years ago

I want to scrape tweets from accounts marked as sensitive, but snscrape doesn't return any tweets from such accounts, while it has no issue with normal accounts.

I don't fully understand how snscrape works, but I think this should be easy to fix by adjusting the Twitter settings of the account that snscrape uses for scraping, so that it can see sensitive tweets.

JustAnotherArchivist commented 3 years ago

Do you have an example of such an account? Is it possible to find the account's tweets through Twitter's search? snscrape does not use an account for the scraping; it merely emulates what you'd get in a browser when using Twitter's search without logging in.

JustAnotherArchivist commented 3 years ago

See also: #65 and #4

IPockAUsername commented 3 years ago

> Do you have an example of such an account?

Sure, any porn account will do. For example @6blogger.

> Is it possible to find the account's tweets through Twitter's search?

If you search Twitter for "from:@6blogger" while logged out, you get nothing. If you run the same query while logged in to an account that allows sensitive media, you get the tweets.

Snscrape works through the Twitter API, doesn't it? Can this setting on the Twitter account that was granted access to the API influence it?
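For reference, the logged-out `from:` search described in this comment can be approximated with snscrape's Python module. This is a minimal sketch, assuming snscrape is installed; `from_query` and `search_tweets` are hypothetical helper names used only for illustration, and whether any tweets come back depends on what Twitter's search exposes without a login.

```python
from itertools import islice


def from_query(handle: str) -> str:
    """Build a search query for tweets sent by one account (hypothetical helper)."""
    # Accept the handle with or without a leading "@".
    name = handle.strip().lstrip("@")
    if not name:
        raise ValueError("empty handle")
    return f"from:{name}"


def search_tweets(handle: str, limit: int = 10) -> list:
    """Return up to `limit` tweets found by the search scraper (hypothetical helper)."""
    # Imported lazily so from_query() works even without snscrape installed.
    import snscrape.modules.twitter as sntwitter

    scraper = sntwitter.TwitterSearchScraper(from_query(handle))
    # get_items() is a generator; islice stops after `limit` results.
    return list(islice(scraper.get_items(), limit))
```

Calling `search_tweets("@6blogger")` would run the same query as the manual search; per the report above, an empty list is the expected outcome for accounts marked as sensitive.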

JustAnotherArchivist commented 3 years ago

It does not, as explained above. If there is no way to get these tweets from the search without logging in, there's nothing snscrape can do about this, I'm afraid.

IPockAUsername commented 3 years ago

Too bad... There is indeed no way of getting those tweets without logging in.

JustAnotherArchivist commented 3 years ago

The twitter-profile scraper should work on these accounts (verified for 6blogger), but it will only be able to retrieve what's shown on the profile page, i.e. the most recent ca. 3200 tweets.
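A command line for the twitter-profile scraper mentioned above could be assembled as follows. This is a sketch under stated assumptions: the `snscrape` executable is on the PATH, the global `--jsonl` and `--max-results` options are placed before the scraper name, and `profile_command` is a hypothetical helper, not part of snscrape.

```python
import shlex


def profile_command(handle: str, max_results=None) -> list:
    """Build the argv for `snscrape twitter-profile` (hypothetical helper)."""
    # Accept the handle with or without a leading "@".
    name = handle.strip().lstrip("@")
    argv = ["snscrape", "--jsonl"]
    if max_results is not None:
        # Optional cap on the number of results.
        argv += ["--max-results", str(max_results)]
    argv += ["twitter-profile", name]
    return argv


# Show the command as it would be typed in a shell.
print(shlex.join(profile_command("@6blogger", max_results=100)))
```

The resulting list can be passed to `subprocess.run` unchanged. As noted above, this route only reaches what the profile page shows, so older tweets beyond that limit stay out of reach.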