JayBizzle / Crawler-Detect

🕷 CrawlerDetect is a PHP class for detecting bots/crawlers/spiders via the user agent
https://crawlerdetect.io
MIT License

AI bots #540

Open staabm opened 2 months ago

staabm commented 2 months ago

I wonder whether this lib could add support for detecting AI bots, i.e. crawlers that are used to feed AI engines. I found a repo which lists most of them: https://github.com/ai-robots-txt/ai.robots.txt

Maybe that's something that could be used for/with crawler-detect?

JayBizzle commented 1 month ago

Hmmm, interesting suggestion.

Would you see our current isCrawler() method returning true for these AI bots, with a new additional method called something like isAiBot()?

Open to implementation suggestions here 👍
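
To make the question concrete, the two-method idea could be sketched roughly as below. This is a minimal standalone sketch, not CrawlerDetect's actual code: the class name `AiBotSketch`, both token lists, and the method signatures are assumptions made for illustration, and the real library matches against its own, much larger pattern list. The AI tokens shown are only a small sample of the kind of entries found in lists such as ai.robots.txt.

```php
<?php

// Hypothetical sketch only; names and lists are illustrative assumptions.
final class AiBotSketch
{
    /** Small, incomplete sample of AI crawler user-agent tokens (assumed). */
    private const AI_TOKENS = ['GPTBot', 'ClaudeBot', 'CCBot'];

    /** Stand-in for the generic crawler patterns the real library ships. */
    private const GENERIC_TOKENS = ['Googlebot', 'bingbot'];

    /** True only for user agents that match one of the AI tokens. */
    public function isAiBot(string $userAgent): bool
    {
        return $this->matchesAny($userAgent, self::AI_TOKENS);
    }

    /** True for any crawler, AI bots included. */
    public function isCrawler(string $userAgent): bool
    {
        return $this->isAiBot($userAgent)
            || $this->matchesAny($userAgent, self::GENERIC_TOKENS);
    }

    /** Case-insensitive substring match against a list of tokens. */
    private function matchesAny(string $userAgent, array $tokens): bool
    {
        foreach ($tokens as $token) {
            if (stripos($userAgent, $token) !== false) {
                return true;
            }
        }

        return false;
    }
}
```

With this shape, existing callers of isCrawler() would start seeing true for AI bots as well, while anyone who wants to handle them separately could check isAiBot() first.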

staabm commented 1 month ago

I think a separate method would be good, since some users may rely on whatever definition of a crawler this lib has used so far.

AI bots are a different class, which might need different handling IMO.

beberlei commented 1 month ago

A crawler is a crawler, isn't it? Regardless of what the crawler actually does with the responses. If users want to differentiate, they should look at https://github.com/VolkswAIgen/VolkswAIgen - but for the purpose of this library I would think that a bot from OpenAI or Claude should be detected as a crawler.

module17 commented 3 days ago

Agreed @beberlei -- I don't see a use case where it'd be necessary to treat an "AI" bot differently. And if there were one, I'd imagine it would be beyond the scope of this library.