JayBizzle / Crawler-Detect

🕷 CrawlerDetect is a PHP class for detecting bots/crawlers/spiders via the user agent
https://crawlerdetect.io
MIT License

False positive detection of Yandex mobile browser as a bot due to HTTP_SEC_CH_UA header #541

Closed IlyaZholobov closed 1 week ago

IlyaZholobov commented 1 month ago

Hi, since the latest patch (version 1.2.120) of crawler-detect, I've encountered an issue where the Yandex mobile browser is being falsely detected as a bot. This seems to be due to the addition of the HTTP_SEC_CH_UA header in the detection process.

In the Yandex mobile browser, the HTTP_SEC_CH_UA header includes the following string: "Not/A)Brand";v="8", "Chromium";v="126", "Yandex";v="24". The string "Yandex" triggers the bot detection, but this is a legitimate browser, not a bot.
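For readers unfamiliar with the header format, the value quoted above is a comma-separated list of brand/version pairs, so "Yandex" appears there as a browser brand name rather than as a crawler identifier. A minimal sketch of pulling those pairs apart is shown here; `parse_sec_ch_ua` is a hypothetical helper written for illustration, it uses a simplified regular expression rather than a full structured-field parser, and it is not part of CrawlerDetect.

```python
import re

# Simplified pattern for entries of the form "Brand";v="Version".
# Assumption: brand and version contain no escaped double quotes.
BRAND_ENTRY = re.compile(r'"([^"]*)";v="([^"]*)"')


def parse_sec_ch_ua(value: str) -> list[tuple[str, str]]:
    """Return (brand, version) pairs found in a Sec-CH-UA header value."""
    return BRAND_ENTRY.findall(value)


header = '"Not/A)Brand";v="8", "Chromium";v="126", "Yandex";v="24"'

for brand, version in parse_sec_ch_ua(header):
    print(f"{brand}: {version}")
```

Seen this way, the "Yandex" entry is the browser's own brand, which is why a plain substring match against the header flags a legitimate browser.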

Could this detection rule be adjusted in the next patch, or is there a way to configure it to avoid this false positive?

module17 commented 1 month ago

For these requests, could you provide the other HTTP_* headers, most importantly the HTTP_USER_AGENT value?

It looks like this may already be addressed by https://github.com/JayBizzle/Crawler-Detect/pull/542

IlyaZholobov commented 1 week ago

UserAgent Example: Mozilla/5.0 (Linux; arm_64; Android 14; 22071212AG) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/126.0.6478.61 YaBrowser/24.7.9.61.00 SA/3 Mobile Safari/537.36 "Not/A)Brand";v="8", "Chromium";v="126", "Yandex";v="24"

HTTP_SEC_CH_UA : "Not/A)Brand";v="8", "Chromium";v="126", "Yandex";v="24"
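The reported behaviour can be reproduced in isolation with a rough sketch. It assumes, based only on the description in this issue, that the values of several request headers are joined into one string before being matched against a bot pattern. `BOT_PATTERN`, `UA_HEADERS`, `build_agent_string` and `looks_like_bot` are hypothetical names chosen for illustration; the single `Yandex` pattern is a stand-in, not CrawlerDetect's actual pattern list, and none of this is the library's real code.

```python
import re

# Hypothetical stand-ins, NOT CrawlerDetect's real pattern or header list.
BOT_PATTERN = re.compile(r"Yandex", re.IGNORECASE)
UA_HEADERS = ["HTTP_USER_AGENT", "HTTP_SEC_CH_UA"]


def build_agent_string(server: dict, headers: list) -> str:
    """Join the values of the selected headers into one string to match against."""
    return " ".join(server[name] for name in headers if name in server)


def looks_like_bot(server: dict, headers: list) -> bool:
    """Return True if the stand-in bot pattern matches the joined header values."""
    return BOT_PATTERN.search(build_agent_string(server, headers)) is not None


# Header values as reported in this issue.
request = {
    "HTTP_USER_AGENT": (
        "Mozilla/5.0 (Linux; arm_64; Android 14; 22071212AG) "
        "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/126.0.6478.61 "
        "YaBrowser/24.7.9.61.00 SA/3 Mobile Safari/537.36"
    ),
    "HTTP_SEC_CH_UA": '"Not/A)Brand";v="8", "Chromium";v="126", "Yandex";v="24"',
}

ua_only = looks_like_bot(request, ["HTTP_USER_AGENT"])
with_client_hints = looks_like_bot(request, UA_HEADERS)

print("User agent only:", ua_only)
print("User agent + Sec-CH-UA:", with_client_hints)
```

Under these assumptions the user agent alone does not match, because it contains "YaBrowser" but not "Yandex"; the match only appears once the Sec-CH-UA value is included.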

IlyaZholobov commented 1 week ago

@module17

https://github.com/JayBizzle/Crawler-Detect/pull/542 - yes, this solves the problem.