atmire / COUNTER-Robots

Official list of user agents that are regarded as robots/spiders by COUNTER
MIT License
User agents with "User-Agent" in their string #27

Closed alanorth closed 4 years ago

alanorth commented 5 years ago

I have thousands of hits in my server logs from clients that send the literal string "User-Agent" inside the value of their HTTP User-Agent header (sometimes even twice!). For example:

I don't know about other protocols, but in HTTP there's no reputable client that does this. As far as I can tell these are bots pretending to be real user agents and doing it poorly. In my case, the IPs associated with these are all on Amazon AWS and the requests are clearly not from a human user.

Should we add User-Agent: to the list of robots? Let me know what you think.

davidatmire commented 4 years ago

Hi @alanorth, we added ^User-Agent today, together with other changes, in this commit: https://github.com/atmire/COUNTER-Robots/commit/56cca8f13337e7acb89fdd0270f023ed9919b6f2

Thanks for your input!

alanorth commented 4 years ago

Great! Yes, you're right, it's better to anchor it as ^User-Agent. Cheers!
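
For anyone reading this later, here is a rough sketch of why the anchor matters. It only assumes the pattern ^User-Agent from the comment above. The function name, the sample strings, and the choice of case-insensitive matching are all made up for illustration and are not taken from my logs or from the project's own code.

```python
import re

# Pattern as discussed in this thread; case-insensitive matching is an assumption.
ANCHORED = re.compile(r"^User-Agent", re.IGNORECASE)
# Unanchored variant, shown only for comparison.
UNANCHORED = re.compile(r"User-Agent", re.IGNORECASE)


def is_robot(user_agent: str, pattern: re.Pattern = ANCHORED) -> bool:
    """Return True if the header value matches the given pattern (hypothetical helper)."""
    return pattern.search(user_agent) is not None


# Hypothetical header values, invented for this example.
samples = [
    "User-Agent: Mozilla/5.0 (compatible)",         # header name leaked into the value
    "User-Agent: User-Agent: Mozilla/5.0",          # leaked twice
    "Mozilla/5.0 (X11; Linux x86_64)",              # ordinary browser-style value
    "ExampleClient/1.0 (custom User-Agent build)",  # mentions the phrase mid-string
]

for ua in samples:
    print(f"{ua!r}: anchored={is_robot(ua)}, unanchored={is_robot(ua, UNANCHORED)}")
```

The anchored pattern only flags values that start with the literal header name, which is the broken behaviour described in this issue. The unanchored one would also flag the last sample, where the phrase merely appears somewhere in the string.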