mitchellkrogza / nginx-ultimate-bad-bot-blocker

Nginx Block Bad Bots, Spam Referrer Blocker, Vulnerability Scanners, User-Agents, Malware, Adware, Ransomware, Malicious Sites, with anti-DDOS, Wordpress Theme Detector Blocking and Fail2Ban Jail for Repeat Offenders

[CCBot/2.0 (http://commoncrawl.org/faq/)] (An excellent nonprofit project that is blocked) #487

Closed rbjarnason closed 1 year ago

rbjarnason commented 1 year ago

Paste the full User-Agent String here

CCBot/2.0 (http://commoncrawl.org/faq/)

Is this for Addition / Removal?

Did the User-Agent request robots.txt first?

Additional information

Thank you for your excellent project, which has been a great help to us. However, we noticed that it blocks CCBot. CommonCrawl is an excellent nonprofit project that we use a lot ourselves. CCBot respects both robots.txt and Crawl-Delay.

mitchellkrogza commented 1 year ago

Unfortunately, this will not be removed; CCBot has been on the block list since the project's inception. You can happily and easily unblock it yourself in https://github.com/mitchellkrogza/nginx-ultimate-bad-bot-blocker/blob/master/bots.d/blacklist-user-agents.conf, but removing it from the default list would upset a LOT of users, and there are thousands of them.

Also see

https://github.com/mitchellkrogza/nginx-ultimate-bad-bot-blocker/issues/515
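
For readers who want to apply the override mentioned above, a minimal sketch of such an entry follows. It assumes that the local copy of bots.d/blacklist-user-agents.conf takes quoted regex patterns followed by a numeric value, and that a value of 0 allows the matching agent; neither detail is stated in this thread, so check the comments at the top of your own copy of the file before relying on it.

```nginx
# bots.d/blacklist-user-agents.conf (your local, user-editable copy)
# ASSUMPTION: entries are "pattern value;" pairs, and a value of 0 allows
# the agent, overriding the default block list. Confirm this against the
# comments in the file itself.
"~*(?:\b)CCBot(?:\b)" 0;
```

After editing, `nginx -t` checks the configuration for syntax errors and `nginx -s reload` applies it.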
