matomo-org / device-detector

The Universal Device Detection library will parse any User Agent and detect the browser, operating system, device used (desktop, tablet, mobile, tv, cars, console, etc.), brand and model.
http://devicedetector.net
GNU Lesser General Public License v3.0
2.87k stars 469 forks

Bot types #7489

Open Simbiat opened 10 months ago

Simbiat commented 10 months ago

The $categories variable for bots has some ambiguous types:

I am fine with creating a PR to harmonize these things a bit, but I think this warrants a proper discussion first.

liviuconcioiu commented 10 months ago

https://github.com/matomo-org/device-detector/issues/5727

Simbiat commented 10 months ago

Hm, in the end that one did not cover the questions above, although it did mention multiple feed bots and resulted in code for validating categories. Essentially, I am talking about cleaning up the types.

sgiehl commented 10 months ago

I guess we don't have a "clean" definition of categories to use. Feel free to create a PR to clean them up a bit.

Simbiat commented 10 months ago

I can add this to #7490. Or would a separate PR be better?

sgiehl commented 10 months ago

@Simbiat It's better to have a separate PR, as that makes reviewing easier.

liviuconcioiu commented 1 month ago

I've come across https://radar.cloudflare.com/traffic/verified-bots, which has a nice classification. Thoughts?

Simbiat commented 1 month ago

What that page suggests:

Personally this is what I would do:

So this would leave these categories:

I also tried thinking of some acronym, but the best GPT and I came up with was SCAIS, because it can be pronounced "skies". Not that we need an acronym or these specific names, of course. But I think they are a good balance between precise and generic.

Any update would require a review of all the bots. I do hope that by the end of the year I will finish going through all brands (and submit a PR to correct quite a few things there) and start working on bots. When I do, I can adjust their categories as well, of course.