JayBizzle / Crawler-Detect

🕷 CrawlerDetect is a PHP class for detecting bots/crawlers/spiders via the user agent
https://crawlerdetect.io
MIT License

Option to disable "exclusions" for single UA parsing #396

Closed · Bilge closed this 3 years ago

Bilge commented 4 years ago

As far as I can tell, the purpose of exclusions is a speed gain observed when parsing user agents in high volume. That's fine, if true, for that particular scenario. However, for the common use case of testing a single user agent, the exclusions step is pure overhead and actually makes the check slower. For such use cases it would be desirable to be able to disable the exclusions step entirely, including all parsing, compilation, and any other handling of the exclusions file, to achieve optimum performance for single-UA testing.
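For context, the single-UA use case in question is just a one-off call to the documented API, along these lines (a minimal sketch; the comment about when exclusions are applied reflects the behaviour described later in this thread):

```php
<?php

require 'vendor/autoload.php';

use Jaybizzle\CrawlerDetect\CrawlerDetect;

// Single user-agent check: the exclusions list is compiled and applied
// internally before the crawler regex runs, even for this one-off call.
$detector = new CrawlerDetect();

$ua = 'Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)';

var_dump($detector->isCrawler($ua)); // bool(true)
```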

JayBizzle commented 4 years ago

We actually use Exclusions to avoid false positives as well. For example, we strip out the string POWER BOT so that when the user agent is run against the crawlers regex, our generic bot regex doesn't flag it as a bot.
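As an illustration of that mechanism only (the pattern and user agent below are made up and are not Crawler-Detect's real lists), stripping an exclusion before running the generic match looks roughly like this:

```php
<?php

// Illustrative only: a simplified stand-in for the crawler regex and a
// fabricated user agent containing the excluded string "POWER BOT".
$exclusions = ['POWER BOT'];
$crawlerPattern = '/bot|crawl|spider/i';

$ua = 'SomeClient/1.0 (POWER BOT hardware revision 2)';

// Remove excluded substrings first...
$stripped = str_ireplace($exclusions, '', $ua);

// ...so the generic "bot" pattern no longer produces a false positive.
var_dump(preg_match($crawlerPattern, $ua));       // int(1) — would be flagged
var_dump(preg_match($crawlerPattern, $stripped)); // int(0) — correctly ignored
```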

This is not to say we wouldn't look at making improvements if there were compelling performance gains to be had, but we'd need to see some benchmark figures as evidence that it would be worth the work 👍
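A rough benchmark of the kind being asked for might look like the sketch below. It only measures the current behaviour (construction plus one check); comparing against a build with the exclusions step removed would require patching the library, since the public API exposes no switch for it.

```php
<?php

require 'vendor/autoload.php';

use Jaybizzle\CrawlerDetect\CrawlerDetect;

$ua = 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36';

// Measure the full cost of a one-off check: construction (including any
// work done on the exclusions list) plus a single isCrawler() call.
$iterations = 1000;
$start = hrtime(true);

for ($i = 0; $i < $iterations; $i++) {
    $detector = new CrawlerDetect();
    $detector->isCrawler($ua);
}

$elapsedMs = (hrtime(true) - $start) / 1e6;
printf("%.3f ms per single-UA check\n", $elapsedMs / $iterations);
```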