atmire / COUNTER-Robots

Official list of user agents that are regarded as robots/spiders by COUNTER
MIT License

"core" filter removes QQBrowser usage #56

Open jByrneSpringer opened 1 year ago

jByrneSpringer commented 1 year ago

QQBrowser is a browser used in China. An example user agent string is "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.25 Safari/537.36 Core/1.70.3641.400 QQBrowser/10.4.3284.400". The "Core/1.70.3641.400" token matches the "core" pattern, so this usage is filtered out as robot traffic.

Do you have a list of the robots that the "core" pattern is meant to target?

mdio commented 1 year ago

In addition, the smartphone Crosscall Core X4 is marked as a crawler due to this:

Mozilla/5.0 (Linux; Android 10; Core-X4 Build/QKQ1.200407.002; wv) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/113.0.5672.76 Mobile Safari/537.36

According to https://www.robotstxt.org/db/core.html this is/was a robot developed by Minho University in Portugal in 1995. I think it's safe to assume that it no longer exists, but I've messaged one of the authors to confirm. If it is still around, I think the string "core" should at least be made more specific to avoid false positives.
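
To make the false positives concrete, here is a minimal Python sketch. It assumes the list entry is the bare pattern "core", applied as a case-insensitive regex search over the whole user agent; that may not match how every consumer of the list applies it. The `NARROW` pattern and the `is_robot` helper are hypothetical, shown only to illustrate what "more specific" could look like, not a proposed replacement.

```python
import re

# User agents quoted earlier in this thread.
QQBROWSER_UA = (
    "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) "
    "Chrome/70.0.3538.25 Safari/537.36 Core/1.70.3641.400 QQBrowser/10.4.3284.400"
)
CROSSCALL_UA = (
    "Mozilla/5.0 (Linux; Android 10; Core-X4 Build/QKQ1.200407.002; wv) "
    "AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/113.0.5672.76 "
    "Mobile Safari/537.36"
)

# Assumption: the current entry is the bare regex "core", matched anywhere
# in the user agent, ignoring case.
BROAD = re.compile(r"core", re.IGNORECASE)

# Hypothetical narrower pattern: "core" only as the leading product token.
# The real robot's user agent is not known here, so this is illustrative only.
NARROW = re.compile(r"^core\b", re.IGNORECASE)


def is_robot(user_agent: str, pattern: re.Pattern) -> bool:
    """Return True if the pattern is found anywhere in the user agent."""
    return pattern.search(user_agent) is not None


for name, ua in [("QQBrowser", QQBROWSER_UA), ("Crosscall Core-X4", CROSSCALL_UA)]:
    print(name, "broad:", is_robot(ua, BROAD), "narrow:", is_robot(ua, NARROW))
```

Under these assumptions, both user agents are flagged by the broad pattern and neither is flagged by the anchored one, since both strings start with "Mozilla".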