arouel / uadetector

UADetector is a library to identify over 190 different desktop and mobile browsers and 130 other User-Agents like feed readers, email clients and multimedia players. In addition, even more than 400 robots like BingBot, Googlebot or Yahoo Bot can be identified.
http://uadetector.sourceforge.net/
Apache License 2.0

Google Mobile Friendly user agent not detected #105

Open PlusMinusNull opened 9 years ago

PlusMinusNull commented 9 years ago

Google offers a mobile-friendliness test inside the Webmaster Tools here:

https://www.google.com/webmasters/tools/mobile-friendly/

The user agent string it uses is not detected as a mobile client:

"Mozilla/5.0 (iPhone; CPU iPhone OS 6_0 like Mac OS X) AppleWebKit/536.26 (KHTML, like Gecko) Version/6.0 Mobile/10A5376e Safari/8536.25 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

jugglingcats commented 9 years ago

This issue is a concern to me too. It seems the cut for robots in the code is not inclusive enough of these cases where the 'Googlebot' string is buried later in the UA string.

dimalinux commented 9 years ago

It is a robot. If the name field of the parsed user agent is "Googlebot-Mobile" and you do a reverse IP lookup* of the client and it is, in fact, Google, you can special-case your code to return your mobile website.

\* While a reverse IP lookup alone could still be forged by the owner of the IP space (versus a reverse plus forward lookup, which can't), the reverse lookup alone will weed out almost all fake Google bots if response time is critical. Keep in mind that most bots claiming to be Google are not really Google.
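
A minimal sketch of the reverse plus forward lookup described above, using only `java.net.InetAddress`. The class and method names are hypothetical, and the accepted host suffixes (`.googlebot.com`, `.google.com`) are an assumption that should be checked against Google's current crawler verification guidance.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.Locale;

// Hypothetical helper, not part of uadetector.
final class GooglebotDnsCheck {

    // True if the host name sits under a domain assumed to belong to Google's crawlers.
    static boolean isGoogleHost(String host) {
        if (host == null) {
            return false;
        }
        String h = host.toLowerCase(Locale.ROOT);
        if (h.endsWith(".")) {
            h = h.substring(0, h.length() - 1); // drop the trailing dot of a fully qualified name
        }
        return h.endsWith(".googlebot.com") || h.endsWith(".google.com");
    }

    // Reverse lookup of the client IP, then a forward lookup of the resulting
    // host name to confirm that it maps back to the same address.
    static boolean isVerifiedGooglebot(String clientIp) {
        try {
            InetAddress addr = InetAddress.getByName(clientIp);
            // Returns the textual IP address if the reverse lookup fails.
            String host = addr.getCanonicalHostName();
            if (!isGoogleHost(host)) {
                return false;
            }
            for (InetAddress forward : InetAddress.getAllByName(host)) {
                if (forward.equals(addr)) {
                    return true;
                }
            }
            return false;
        } catch (UnknownHostException e) {
            return false;
        }
    }
}
```

Both lookups block on DNS, so in practice the result would probably be cached per IP and only computed for requests whose user agent already claims to be Googlebot.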

jugglingcats commented 9 years ago

I found that the Google adwords mobile bot was not detected as a robot. I will post some sample code.

I don't think many of us are doing reverse IP lookups, due to the performance overhead. For us it's more important to classify roughly than to be 100% accurate. We know some people will fake the UA. Besides, if UADetector doesn't indicate a robot, we wouldn't do a reverse lookup anyway.

dimalinux commented 9 years ago

@jugglingcats, @PlusMinusNull started the thread and he was having the reverse issue, i.e. he wanted Google's bot, which tests/indexes sites as if it were a mobile client, to be classified as a mobile client, but it's getting classified as a bot.

I was only recommending the reverse lookup for the exceptional case when you want to treat a bot as a real user. In that case you could potentially conserve resources, as only a fraction of the bots claiming to be Google are actually Google, and you wouldn't serve up your site to the fake ones.

jugglingcats commented 9 years ago

My bad, I misread the original post. I will do some more testing and raise a separate issue for the Google adwords bot detection. Thanks.

thorsten72 commented 9 years ago

I have the same problem as the OP: I want my application to treat Googlebot-Mobile like a mobile client (and redirect it to the mobile page). Right now, I use uadetector to check whether the user agent is categorized as ReadableDeviceCategory.Category.SMARTPHONE, but the mobile Google robot is (correctly) categorized by uadetector as 'OTHER'.

When debugging, I found that there is a field ReadableUserAgent.name which contains the value Googlebot-Mobile.

Do you think it's acceptable to check this field for the concrete value Googlebot-Mobile?
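
Something like the following is what such a check could look like. It is only a sketch: the class and method names are hypothetical, and the two arguments are assumed to come from the parsed agent (for example the name field mentioned above and the name of its ReadableDeviceCategory.Category value), passed in as plain strings so the snippet stands alone.

```java
// Hypothetical routing helper, not part of uadetector.
final class MobileRouting {

    // categoryName: device category reported by the parser, e.g. "SMARTPHONE" or "OTHER".
    // agentName:    parsed agent name, e.g. "Googlebot-Mobile".
    static boolean shouldServeMobileSite(String categoryName, String agentName) {
        if ("SMARTPHONE".equals(categoryName)) {
            return true; // regular mobile clients
        }
        // Special case: Google's mobile crawler is categorized as OTHER,
        // but should still be shown the mobile page.
        return "Googlebot-Mobile".equals(agentName);
    }
}
```

Since the user agent string is trivially forged, this only decides which page variant to render; it says nothing about whether the client really is Google.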

godenji commented 9 years ago

+1 for a resolution to this. I added the obvious fallback (searching the lower-cased user-agent string for "mobile") in case the UA is not detected, which fixes the Googlebot version referenced by the OP seeing our mobile site as desktop.
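
Roughly, the fallback amounts to the sketch below. The class and method names are made up for illustration, and it would only be consulted when the library does not already report a mobile device.

```java
import java.util.Locale;

// Hypothetical fallback, applied only when the parser did not report a mobile client.
final class MobileFallback {

    // Case-insensitive substring test for "mobile" on the raw user-agent string.
    static boolean looksMobile(String userAgent) {
        return userAgent != null
                && userAgent.toLowerCase(Locale.ROOT).contains("mobile");
    }
}
```

This is deliberately crude: it matches the "Mobile/10A5376e" token in the OP's string, but it would also match any other agent that happens to include the word.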

phjardas commented 9 years ago

We're facing a similar issue. Funnily enough, the user agent

Mozilla/5.0 (iPhone; CPU iPhone OS 6_0 like Mac OS X) AppleWebKit/536.26 (KHTML, like Gecko) Version/6.0 Mobile/10A5376e Safari/8536.25 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)

is recognized as Robot while the user agent

Mozilla/5.0 (iPhone; CPU iPhone OS 8_3 like Mac OS X) AppleWebKit/600.1.4 (KHTML, like Gecko) Version/8.0 Mobile/12F70 Safari/600.1.4 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)

is classified as Mobile Browser.

Contrary to the OP I would expect both to be classified as Robot.

jugglingcats commented 9 years ago

I agree with @phjardas: for our use case (robot detection) we would like both to be classified as Robot, but I can see that the opposite view is equally valid, e.g. for site adaptation for mobile browsers.

Would it be an idea to split robot status out into a separate field/flag? Increasingly, robots present themselves with different user agents specifically so that they can interpret sites from the perspective of different devices.

phjardas commented 9 years ago

+1 for the separation of bot-ness into a separate boolean flag.
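
To make the proposal concrete, here is a sketch of what a result with independent flags could look like. Everything in it is hypothetical (uadetector does not expose such a type in this thread), and the substring tests are crude stand-ins for real detection.

```java
import java.util.Locale;

// Hypothetical result type: bot-ness and device class are independent,
// so a crawler that presents itself as an iPhone can be both.
final class AgentTraits {
    final boolean robot;
    final boolean mobile;

    AgentTraits(boolean robot, boolean mobile) {
        this.robot = robot;
        this.mobile = mobile;
    }

    // Illustrative classification only; a real implementation would use
    // the library's own data rather than two substring checks.
    static AgentTraits classify(String userAgent) {
        String ua = userAgent == null ? "" : userAgent.toLowerCase(Locale.ROOT);
        boolean robot = ua.contains("googlebot");
        boolean mobile = ua.contains("mobile");
        return new AgentTraits(robot, mobile);
    }
}
```

With this shape, both Googlebot strings quoted above would come out as robot and mobile at the same time, so robot-detection code and mobile-adaptation code can each read the flag they care about.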