rugantio / fbcrawl

A Facebook crawler
Apache License 2.0
Getting banned after crawling #12

Closed ansereb closed 5 years ago

ansereb commented 5 years ago

Excuse me, this issue is not about the crawler itself but about how to use it. I am creating new Facebook accounts, and after a couple of crawls FB starts showing "We've recently noticed unusual activity from your account....". I am out of phone numbers and don't want to send a photo of my ID. Is there a way to use this crawler without getting banned? I tried adding a DOWNLOAD_DELAY to settings.py, without success. I also tried disabling cookies with COOKIES_ENABLED = False, but the crawler does not work with that setting.

Thanks.

vecna commented 5 years ago

Hi, just to understand the scale here: how long did your crawl last (days or hours)? How many posts did you collect?

ansereb commented 5 years ago

It was just 2 or 3 crawls of comments from the post used in the example on the main page of this repository. Maybe because of the 2018 Facebook drama they are watching new accounts really closely. Does anyone have the same issue with a new account?

ansereb commented 5 years ago

An update: it affects not only new accounts but also old ones with groups, friends and stuff.

rugantio commented 5 years ago

Hey @Brain2998, thx for the heads up. Facebook surely has automatic methods to monitor new accounts, and they have become more stringent after the Cambridge Analytica (CA) scandal. Almost all of these blocks require some sort of human interaction to lift (if lifting them is even possible), so there is really not much we can do about it on the code side. Trying to confuse FB by changing the user agent, the download delay and the number of concurrent requests are all good tries (did you try changing IP with a VPN?), but if you are already on a blacklist none of them might work, as you noticed. This behavior has never occurred to me with old accounts that have real friends; they aren't usually closely watched, so that is something we should really look into. It might be that your fingerprint was blacklisted during the previous crawl and that this ultimately led to the real account being screened as well. If you find any workaround or want to add more info, I encourage you to write it up!
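
For reference, the throttling knobs mentioned in this thread (download delay, concurrent requests, user agent, cookies) could be set in settings.py roughly as in the sketch below. The setting names are standard Scrapy settings, but every value here is an assumption chosen for illustration, including the user agent string. None of it has been tested against Facebook, and as said above it may not help at all once an account or fingerprint is already flagged.

```python
# settings.py (Scrapy project settings)
# Illustrative values only: a conservative throttling sketch, not a tested recipe.

# Base wait between consecutive requests to the same site, in seconds.
DOWNLOAD_DELAY = 5

# Wait a random time between 0.5x and 1.5x DOWNLOAD_DELAY instead of a fixed one.
RANDOMIZE_DOWNLOAD_DELAY = True

# Send one request at a time, both overall and per domain.
CONCURRENT_REQUESTS = 1
CONCURRENT_REQUESTS_PER_DOMAIN = 1

# Let Scrapy's AutoThrottle extension adapt the delay to observed latency.
AUTOTHROTTLE_ENABLED = True
AUTOTHROTTLE_START_DELAY = 5
AUTOTHROTTLE_MAX_DELAY = 60
AUTOTHROTTLE_TARGET_CONCURRENCY = 1.0

# Keep cookies on: as reported above, the crawler does not work with them
# disabled (presumably because the logged-in session is carried in cookies).
COOKIES_ENABLED = True

# Example browser-like user agent string (an assumption; pick your own).
USER_AGENT = "Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Firefox/60.0"
```

With RANDOMIZE_DOWNLOAD_DELAY enabled and the values above, the gap between requests varies between roughly 2.5 and 7.5 seconds before AutoThrottle adjusts it further.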