mitchellkrogza / nginx-ultimate-bad-bot-blocker

Nginx Block Bad Bots, Spam Referrer Blocker, Vulnerability Scanners, User-Agents, Malware, Adware, Ransomware, Malicious Sites, with anti-DDOS, Wordpress Theme Detector Blocking and Fail2Ban Jail for Repeat Offenders

GPTBot OpenAI's new web crawler #530

Closed: robwent closed this issue 10 months ago

robwent commented 10 months ago

Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; GPTBot/1.0; +https://openai.com/gptbot)

Is this for Addition / Removal?

Did the User-Agent request robots.txt first?

Post Log Excerpt to show User-Agent behavior (10-20 lines is enough)


40.83.2.64 - - [04/Aug/2023:09:25:54 -0400] "GET /little-free-pantries-spreading-compassion-goodies-communities/ HTTP/1.1" 301 302 "-" "Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; GPTBot/1.0; +https://openai.com/gptbot)"  
40.83.2.72 - - [07/Aug/2023:18:02:07 +0000] "GET /yachts-for-sale/pass-the-hatt-254119/ HTTP/1.1" 200 49329 "-" "Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; GPTBot/1.0; +https://openai.com/gptbot)"  
40.83.2.71 - - [07/Aug/2023:18:02:18 +0000] "GET /yachts-for-charter/xoxo-9969/ HTTP/1.1" 200 54171 "-" "Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; GPTBot/1.0; +https://openai.com/gptbot)"  
40.83.2.79 - - [07/Aug/2023:18:02:29 +0000] "GET /yachts-for-sale/yachtdetails/235658/1958/sangermani/-/75-7/ HTTP/1.1" 200 49330 "-" "Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; GPTBot/1.0; +https://openai.com/gptbot)"  
40.83.2.68 - - [07/Aug/2023:18:02:41 +0000] "GET /yachts-for-sale/four-j-s-270071/ HTTP/1.1" 499 0 "-" "Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; GPTBot/1.0; +https://openai.com/gptbot)"  
40.83.2.78 - - [07/Aug/2023:18:02:51 +0000] "GET /yachts-for-sale/crystal-anne-244666/ HTTP/1.1" 200 49330 "-" "Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; GPTBot/1.0; +https://openai.com/gptbot)"  
40.83.2.67 - - [07/Aug/2023:18:03:02 +0000] "GET /yachts-for-sale/johnson-83-flybridge-w-fishing-cockpit-242841/ HTTP/1.1" 499 0 "-" "Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; GPTBot/1.0; +https://openai.com/gptbot)"  
40.83.2.72 - - [07/Aug/2023:18:03:21 +0000] "GET /yachts-for-sale/stella-268599/ HTTP/1.1" 200 49330 "-" "Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; GPTBot/1.0; +https://openai.com/gptbot)"  
40.83.2.65 - - [07/Aug/2023:18:03:32 +0000] "GET /yachts-for-sale/somewhere-i-belong-254947/%23elementor-action%3Aaction%3Dpopup%3Aopen%26settings%3DeyJpZCI6IjIzNzEiLCJ0b2dnbGUiOmZhbHNlfQ%3D%3D/ HTTP/1.1" 200 49330 "-" "Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; GPTBot/1.0; +https://openai.com/gptbot)"  
40.83.2.66 - - [07/Aug/2023:18:03:43 +0000] "GET /yachts-for-sale/sindbad-269328/ HTTP/1.1" 200 49330 "-" "Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; GPTBot/1.0; +https://openai.com/gptbot)"  
40.83.2.66 - - [07/Aug/2023:18:03:53 +0000] "GET /yachts-for-charter/team-factory-10696/ HTTP/1.1" 200 54155 "-" "Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; GPTBot/1.0; +https://openai.com/gptbot)"  
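For anyone who wants to check their own logs for the same behavior, here is a minimal sketch that summarizes requests from one user-agent token and reports whether robots.txt was ever requested. It assumes the nginx "combined" log format seen in the excerpt above; the function name `summarize_bot` and the regex are illustrative only and are not part of the blocker.

```python
import re
from collections import Counter

# Assumed layout (nginx "combined" format):
# ip - user [time] "request" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"\s*$'
)


def summarize_bot(lines, token="GPTBot"):
    """Summarize requests whose user agent contains `token` (case-insensitive)."""
    statuses = Counter()
    ips = set()
    fetched_robots = False
    for line in lines:
        m = LOG_LINE.match(line)
        if m is None or token.lower() not in m.group("agent").lower():
            continue  # unparsable line or a different client
        statuses[m.group("status")] += 1
        ips.add(m.group("ip"))
        parts = m.group("request").split()
        # parts[1] is the request path; ignore any query string
        if len(parts) >= 2 and parts[1].split("?")[0] == "/robots.txt":
            fetched_robots = True
    return {
        "statuses": dict(statuses),
        "ips": sorted(ips),
        "fetched_robots": fetched_robots,
    }


# Two lines copied from the excerpt above
sample = [
    '40.83.2.64 - - [04/Aug/2023:09:25:54 -0400] "GET /little-free-pantries-spreading-compassion-goodies-communities/ HTTP/1.1" 301 302 "-" "Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; GPTBot/1.0; +https://openai.com/gptbot)"',
    '40.83.2.68 - - [07/Aug/2023:18:02:41 +0000] "GET /yachts-for-sale/four-j-s-270071/ HTTP/1.1" 499 0 "-" "Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; GPTBot/1.0; +https://openai.com/gptbot)"',
]

result = summarize_bot(sample)
print(result)
```

Running it over a full access log (for example `summarize_bot(open(path))`, where `path` is your own log file) gives a quick answer to the "requested robots.txt first?" question in the template.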

Additional information

Official docs here. IP ranges here.
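Because a user-agent string can be spoofed, it can help to check the source address against the published ranges as well. A rough sketch using Python's standard `ipaddress` module follows. The `40.83.2.64/28` network is an assumption chosen only because it covers the addresses in the log excerpt; it is not copied from the official list, so substitute the real ranges before relying on it. `ASSUMED_RANGES` and `in_ranges` are hypothetical names.

```python
import ipaddress

# ASSUMPTION: placeholder range that happens to cover 40.83.2.64 through
# 40.83.2.79 as seen in the excerpt; replace with the officially published ranges.
ASSUMED_RANGES = [ipaddress.ip_network("40.83.2.64/28")]


def in_ranges(ip, ranges=ASSUMED_RANGES):
    """Return True if `ip` falls inside any of the given networks."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in ranges)


print(in_ranges("40.83.2.72"))
print(in_ranges("40.83.2.80"))
```

A request that claims to be GPTBot but comes from outside the published ranges is probably an impersonator.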

mitchellkrogza commented 10 months ago

Thanks @robwent, this one will quickly find itself on the blocked bots list. I already block a lot of AI crawlers, so this one should be no exception.