cbschuld / Browser.php

A PHP Class to detect a user's Browser. This encapsulation provides a breakdown of the browser and the version of the browser using the browser's user-agent string. This is not a guaranteed solution but provides an overall accurate way to detect what browser a user is using.
https://chrisschuld.com/projects/browser-php-detecting-a-users-browser-from-php/
MIT License
580 stars, 303 forks

Googlebot is shown as iPhone #29

Open sktnetwork opened 10 years ago

sktnetwork commented 10 years ago

This issue skewed our internal analytics.

Sample Input UserAgent: "Mozilla/5.0 (iPhone; CPU iPhone OS 6_0 like Mac OS X) AppleWebKit/536.26 (KHTML, like Gecko) Version/6.0 Mobile/10A5376e Safari/8536.25 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

Output: iPhone.

Expected Output: GoogleBot

Solution / Fix: The checks for Googlebot (and other bots) must run before the browser and mobile checks, because the first check that matches wins. Replace the function checkBrowsers with this:

protected function checkBrowsers()
{
    return (
        // well-known, well-used
        // Special Notes:
        // (1) Opera must be checked before FireFox due to the odd
        //     user agents used in some older versions of Opera
        // (2) WebTV is strapped onto Internet Explorer so we must
        //     check for WebTV before IE
        // (3) (deprecated) Galeon is based on Firefox and needs to be
        //     tested before Firefox is tested
        // (4) OmniWeb is based on Safari so OmniWeb check must occur
        //     before Safari
        // (5) Netscape 9+ is based on Firefox so Netscape checks
        //     before FireFox are necessary

        // common bots (checked first so that a bot emulating a browser
        // or device is reported as the bot)
        $this->checkBrowserGoogleBot() ||
        $this->checkBrowserMSNBot() ||
        $this->checkBrowserBingBot() ||
        $this->checkBrowserSlurp() ||

        $this->checkBrowserWebTv() ||
        $this->checkBrowserInternetExplorer() ||
        $this->checkBrowserOpera() ||
        $this->checkBrowserGaleon() ||
        $this->checkBrowserNetscapeNavigator9Plus() ||
        $this->checkBrowserFirefox() ||
        $this->checkBrowserChrome() ||
        $this->checkBrowserOmniWeb() ||

        // common mobile
        $this->checkBrowserAndroid() ||
        $this->checkBrowseriPad() ||
        $this->checkBrowseriPod() ||
        $this->checkBrowseriPhone() ||
        $this->checkBrowserBlackBerry() ||
        $this->checkBrowserNokia() ||

        // check for facebook external hit when loading URL
        $this->checkFacebookExternalHit() ||

        // WebKit base check (post mobile and others)
        $this->checkBrowserSafari() ||

        // everyone else
        $this->checkBrowserNetPositive() ||
        $this->checkBrowserFirebird() ||
        $this->checkBrowserKonqueror() ||
        $this->checkBrowserIcab() ||
        $this->checkBrowserPhoenix() ||
        $this->checkBrowserAmaya() ||
        $this->checkBrowserLynx() ||
        $this->checkBrowserShiretoko() ||
        $this->checkBrowserIceCat() ||
        $this->checkBrowserIceweasel() || 
        $this->checkBrowserW3CValidator() ||
        $this->checkBrowserMozilla() /* Mozilla is such an open standard that you must check it last */
    );
}
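
To illustrate why the order matters, here is a minimal, self-contained sketch. It does not use the Browser class: `detectBrowser()` and the two check tables are hypothetical stand-ins that assume each check boils down to a case-insensitive substring test, with the first match winning.

```php
<?php
// Hypothetical, simplified stand-in for the chain of checks in checkBrowsers():
// each entry is a substring test, and the first entry that matches wins.
function detectBrowser(string $userAgent, array $orderedChecks): string
{
    foreach ($orderedChecks as $name => $needle) {
        if (stripos($userAgent, $needle) !== false) {
            return $name;
        }
    }
    return 'unknown';
}

// The sample user agent from this issue.
$ua = 'Mozilla/5.0 (iPhone; CPU iPhone OS 6_0 like Mac OS X) AppleWebKit/536.26 '
    . '(KHTML, like Gecko) Version/6.0 Mobile/10A5376e Safari/8536.25 '
    . '(compatible; Googlebot/2.1; +http://www.google.com/bot.html)';

// Mobile check first: the iPhone token matches before the bot is considered.
$mobileFirst = ['iPhone' => 'iphone', 'GoogleBot' => 'googlebot'];
// Bot check first: the order proposed above.
$botFirst = ['GoogleBot' => 'googlebot', 'iPhone' => 'iphone'];

echo detectBrowser($ua, $mobileFirst), PHP_EOL; // prints "iPhone"
echo detectBrowser($ua, $botFirst), PHP_EOL;    // prints "GoogleBot"
```

The same user agent gives two different answers depending only on the order of the checks, which is why the bot checks are moved to the top in the replacement function.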

cbschuld commented 5 years ago

This is an interesting issue because Googlebot is emulating an iPhone here so that it sees the page the way a mobile device would. The question is whether we care that it's the bot or that it's an iPhone; I'm interested in the wider opinion.

sktnetwork commented 5 years ago

We should report it as GoogleBot; for any analytical purpose, it's GoogleBot.

cbschuld commented 5 years ago

@sktnetwork - I can follow that thinking; I'll make some adjustments for 1.9.4; thanks.

cbschuld commented 5 years ago

@sktnetwork - it will be in 1.9.5; it didn't make it into 1.9.4 because I ran out of time.

Eraser3 commented 2 months ago

I guess this never made it into production. I used your script today and ran into the same problem when trying to detect Googlebot. Also, Googlebot uses different "names" these days. I see "GoogleOther" a lot. For the complete list, see: https://developers.google.com/search/docs/crawling-indexing/overview-google-crawlers
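
A rough sketch of how several crawler names could be matched at once. This is not part of Browser.php: `matchGoogleCrawler()` and `GOOGLE_CRAWLER_TOKENS` are hypothetical names, and the token list is a small illustrative subset that should be checked against the page linked above.

```php
<?php
// Illustrative, non-exhaustive list of Google crawler tokens (an assumption;
// verify against Google's crawler overview before relying on it).
const GOOGLE_CRAWLER_TOKENS = [
    'Googlebot',
    'GoogleOther',
    'Google-InspectionTool',
    'Storebot-Google',
    'AdsBot-Google',
];

// Returns the first token found in the user agent (case-insensitive),
// or null when none of the listed tokens is present.
function matchGoogleCrawler(string $userAgent): ?string
{
    foreach (GOOGLE_CRAWLER_TOKENS as $token) {
        if (stripos($userAgent, $token) !== false) {
            return $token;
        }
    }
    return null;
}

// The sample user agent from the top of this issue.
$sample = 'Mozilla/5.0 (iPhone; CPU iPhone OS 6_0 like Mac OS X) AppleWebKit/536.26 '
    . '(KHTML, like Gecko) Version/6.0 Mobile/10A5376e Safari/8536.25 '
    . '(compatible; Googlebot/2.1; +http://www.google.com/bot.html)';

var_dump(matchGoogleCrawler($sample)); // string(9) "Googlebot"
```

A check like this would still need to run before the mobile checks. User-agent strings can also be spoofed, so a match only says that the client claims to be a Google crawler.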