Closed by cr101 6 years ago
As you have correctly pointed out, the user agent can be easily spoofed. Nothing we can do about that.
We do check a few headers other than the user agent to detect some bots, e.g. Googlebot sometimes spoofs the user agent but identifies itself via the From header.
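To illustrate the idea of looking beyond the user agent, here is a rough sketch in Python. It is not this library's actual code: the function name `looks_like_bot`, the pattern list, and the rule "any non-empty From header is a bot hint" are all assumptions made for the example (browsers do not normally send a From header, so its presence is a reasonable but not conclusive signal).

```python
import re

# Hypothetical, deliberately tiny pattern list; a real detector would use
# a much larger, maintained set of user agent patterns.
BOT_UA_PATTERN = re.compile(r"bot|crawl|spider|slurp", re.IGNORECASE)


def looks_like_bot(headers):
    """Return True if the request headers hint at a bot.

    `headers` is a plain dict of header name -> value. Header names are
    compared case-insensitively.
    """
    normalized = {name.lower(): value for name, value in headers.items()}

    # First signal: the user agent itself admits to being a crawler.
    if BOT_UA_PATTERN.search(normalized.get("user-agent", "")):
        return True

    # Second signal (assumption for this sketch): a non-empty From header.
    # Some crawlers send one even when the user agent looks like a browser.
    return bool(normalized.get("from", "").strip())
```

For example, a request with a mobile browser user agent plus `From: googlebot(at)googlebot.com` would be flagged, while the same user agent without a From header would not. Both headers are still client-controlled, so this only catches bots that are willing to identify themselves.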
IP address checking would be the next step. We have explored the idea in the past, but decided it would be too hard to maintain.
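For anyone who wants to add IP checking in their own application, one approach that avoids a hand-maintained IP list is forward-confirmed reverse DNS: resolve the IP to a hostname, check that the hostname belongs to the crawler's domain, then resolve that hostname back and confirm it maps to the same IP. Google documents this method for verifying Googlebot. The sketch below is an illustration only, not something this library does; `verify_crawler_ip`, the suffix list, and the injectable lookup parameters are hypothetical names chosen for the example.

```python
import socket

# Illustrative suffixes only; check the crawler operator's documentation
# for the domains it actually uses.
TRUSTED_SUFFIXES = (".googlebot.com", ".google.com")


def verify_crawler_ip(ip, suffixes=TRUSTED_SUFFIXES,
                      reverse_lookup=None, forward_lookup=None):
    """Forward-confirmed reverse DNS check for a claimed crawler IP.

    The lookup functions can be overridden (e.g. in tests); by default
    they use the standard library resolver, which here handles IPv4 only.
    """
    if reverse_lookup is None:
        reverse_lookup = lambda addr: socket.gethostbyaddr(addr)[0]
    if forward_lookup is None:
        forward_lookup = lambda host: socket.gethostbyname_ex(host)[2]

    try:
        host = reverse_lookup(ip)
    except OSError:
        # No PTR record or resolver failure: cannot be verified.
        return False

    # The leading dot in each suffix prevents "evilgooglebot.com" matching.
    if not host.lower().rstrip(".").endswith(tuple(suffixes)):
        return False

    try:
        # Forward-confirm: the hostname must resolve back to the same IP.
        return ip in forward_lookup(host)
    except OSError:
        return False
```

The trade-off is that every check costs two DNS lookups, so in practice the result would need to be cached per IP, and the suffix list still has to be kept up to date for each crawler you want to verify.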
First of all, thank you for your software.
A lot of bots spoof user agents. Some do it for legitimate reasons (e.g. they only want to crawl mobile content), while others simply don't want to be identified as bots. Even worse, some bots spoof the user agents of legitimate crawlers, such as those of Google, Microsoft and others that are generally considered polite.
How reliable is detecting bots/crawlers/spiders via the user agent?