arouel / uadetector

UADetector is a library to identify over 190 different desktop and mobile browsers and 130 other User-Agents like feed readers, email clients and multimedia players. In addition, even more than 400 robots like BingBot, Googlebot or Yahoo Bot can be identified.
http://uadetector.sourceforge.net/
Apache License 2.0

How to report true browser, with IE compatibility? #64

Open gavdjones opened 10 years ago

gavdjones commented 10 years ago

Is there another API to get the true IE version? We have IE 9, 10, 11 being reported as IE 7 or 8 based on compatibility settings. Thanks. Using the latest jars.

arouel commented 10 years ago

We currently have no API for that. I assume IE sends a different user agent string when the compatibility settings are switched. Could you test this and report which user agent string variants are sent? Maybe we can find something to differentiate them.

gavdjones commented 10 years ago

The setting I'm referring to is under Tools -> Compatibility View Settings -> (Add this site, OR if it's an intranet site)

With the setting off, my UA string is:

Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko

With the setting on, my UA string is:

Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.1; WOW64; Trident/7.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; .NET4.0C; InfoPath.3; MS-RTC LM 8)

As you can see, the UA string does tell the truth if you look at the Trident token (Trident 7 is IE 11). It seems that UADetector does not recognize this and favors the 'compatible; MSIE 7.0' part instead.

See: http://msdn.microsoft.com/library/ms537503.aspx

E.g. "Trident/5.0": the Trident token identifies the version of MSHTML (Trident) and can be used to determine whether or not the webpage is displayed in Compatibility View.
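The comparison described above can be sketched in a few lines of Java. This is only an illustration, not part of UADetector: the class `TridentSniffer` and its methods are hypothetical names, and the check assumes that Trident 4 to 7 correspond to IE 8 to 11, so an MSIE token lower than "Trident major + 4" is taken to indicate Compatibility View.

```java
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper (not a UADetector class) that reads the raw UA string. */
final class TridentSniffer {
    // Captures the major number of tokens such as "Trident/7.0" and "MSIE 7.0".
    private static final Pattern TRIDENT = Pattern.compile("Trident/(\\d{1,3})\\.");
    private static final Pattern MSIE = Pattern.compile("MSIE (\\d{1,3})\\.");

    private TridentSniffer() {}

    /** Major version of the Trident token, or empty if the token is absent. */
    static OptionalInt tridentMajor(String userAgent) {
        return firstGroup(TRIDENT, userAgent);
    }

    /** Major version of the MSIE token, or empty if the token is absent. */
    static OptionalInt msieMajor(String userAgent) {
        return firstGroup(MSIE, userAgent);
    }

    /**
     * Assumption: Trident 4..7 ship with IE 8..11, so a claimed MSIE version
     * below "Trident major + 4" means the browser runs in Compatibility View.
     */
    static boolean inCompatibilityView(String userAgent) {
        OptionalInt trident = tridentMajor(userAgent);
        OptionalInt msie = msieMajor(userAgent);
        return trident.isPresent() && msie.isPresent()
                && msie.getAsInt() < trident.getAsInt() + 4;
    }

    private static OptionalInt firstGroup(Pattern pattern, String userAgent) {
        Matcher m = pattern.matcher(userAgent);
        return m.find() ? OptionalInt.of(Integer.parseInt(m.group(1))) : OptionalInt.empty();
    }
}
```

For the two strings quoted above, the Trident major is 7 in both cases, while only the second string carries an MSIE token (7), so only that one would be flagged as Compatibility View.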

arouel commented 10 years ago

@gavdjones It seems to be solved within ticket #66, right?

gavdjones commented 10 years ago

Yes, I think so!

We will test. Thank you.


scrawfor commented 10 years ago

This still seems broken to me. I'm using the 2014.03 version. Should it be working properly?

arouel commented 10 years ago

@scrawfor After you commented here, I tested it, and you're right: it isn't solved.

But the question in the end is, do we really want to determine the compatibility setting or the true browser? The answer depends on the use case.

We would need more information from the underlying UAS database to provide this via an API, so that users can decide which of the two they are interested in.

scrawfor commented 10 years ago

Totally agree that the answer depends on the use case. In my opinion, the major version should represent the true browser version, while there should be a separate field that represents the compatibility view version. In our case, we would be interested in both.

I'm unfamiliar with the structure of the product. What would be necessary from the UAS database? Is the value of the Trident token not exposed?

arouel commented 10 years ago

@scrawfor Exactly, the Trident token like any other Known Fragment is not exposed.

Some time ago I asked the maintainer of the UAS database (Jaroslav Mallat, @mallat) to also provide this information, because I saw it on his site (take a look here). Please ask him too. Maybe he is working on it. (From time to time he also looks into the issues of this project, but I don't know whether he has seen this one.)

dimalinux commented 10 years ago

I think the default use case should be detection of the actual browser. JavaScript best practices, for many years now, have been to use object and feature detection instead of analyzing the user agent.

arouel commented 10 years ago

@dimalinux Good point. Then we should think about improving the current regular expression so that it takes the Trident fragment into account when determining the true IE browser, right?

scrawfor commented 10 years ago

@dimalinux Agreed, the default should certainly be the actual browser. However, there are use cases where it might be helpful to have compatibility mode listed as well.

We're doing some reporting on this for example.

gavdjones commented 10 years ago

Does anyone know whether there has been any progress on this?

I would agree that we need to be able to separate this. Compatibility mode is not the same as IE 7 either. Since users have full control over whether they add your site to the compatibility list (or it may be set by a group policy), websites should be able to tell. Especially since compatibility mode can fix and/or create harmful issues.

In the meantime, I still have to harvest the raw headers from our database, using CASE statements like the following. Note that we block IE 7, so the assumption here is that any "MSIE 7.0" means compatibility mode for whatever browser the Trident token indicates. (This is not perfect, but it is more effective for our needs than the library at present.)

CASE WHEN SIGN(INSTR ( (http_headers), 'MSIE 7.0')) = 1 THEN 'Y' ELSE 'N' END compat,
CASE
  WHEN SIGN(INSTR ( (http_headers), 'Trident/7')) = 1 THEN 'IE 11'
  WHEN SIGN(INSTR ( (http_headers), 'Trident/6')) = 1 THEN 'IE 10'
  WHEN SIGN(INSTR ( (http_headers), 'Trident/5')) = 1 THEN 'IE 9'
  WHEN SIGN(INSTR ( (http_headers), 'Trident/4')) = 1 THEN 'IE 8'
  WHEN SIGN(INSTR ( (http_headers), 'MSIE 7.0')) = 1 THEN 'IE 7'
  WHEN SIGN(INSTR ( (http_headers), 'MSIE 8.0')) = 1 THEN 'IE 8'
  WHEN SIGN(INSTR ( (http_headers), 'MSIE 9.0')) = 1 THEN 'IE 9'
  .... etc

jsr88f commented 9 years ago

Any update on getting visibility into the Trident token? I think this is a very important feature, because an X-UA-Compatible meta tag placed on a page (or the equivalent response header) can switch off IE compatibility mode, but IE still sends the 'incorrect' User-Agent header. In other words, a browser in compatibility mode can be told to behave as its real self, yet it still sends an IE7-like User-Agent header, and UADetector has no way of catching that. Here is a Microsoft link on the use of the Trident token: https://msdn.microsoft.com/en-us/library/ms537503(v=vs.85).aspx

philsurette commented 9 years ago

I got here via a web search and would add my voice to the call for adding support to the UADetector API to allow detection of the real version of IE running, regardless of compatibility mode. Because of this missing feature we have been unable to adopt this excellent api.

arouel commented 9 years ago

@dgchtch, @philsurette I think the true IE version detection should be separate from the underlying database that is in use. So it could be an extension of the current API which the user has to call explicitly, like a delegate on top of the existing parsers that work directly with the database. What do you think?

philsurette commented 9 years ago

Thanks for your quick reply! I took some time to look at the UADetector code so I could try to contribute intelligently to this conversation.

I think first we need some terminology. When IE 11 is running in compatibility mode emulating IE 7, then the browser 'version' reported by UADetector is IE 7 and this cannot change. We need a name for the 'real' IE version (11). I suggest using the term 'implementationVersion' to refer to the real version, although 'realVersion' or 'hostBrowserVersion' or somesuch would work.

I think the simplest (but maybe not the best) way to modify UADetector to support both versions would be to add a getImplementationVersion method to the ReadableUserAgent interface. The default implementation would be to return the same value for both, but for IE there could be different values.

If by delegating you mean that in the case of IE you would delegate the version parsing to an IE-specific class that knows to look for the Trident version number and convert this to an IE version number, then I think that would be a viable approach. And maybe the simplest. Another, more general approach could be to add a new 'implementations_browser_reg' to the database containing the regexes required to identify the implementation browser info in the case of a browser running in compatibility mode. The 'implementation_browser_reg' elements would likely need to include a reference to a Java class that would convert the user agent info correctly. For instance, an IEParser class would need to convert the Trident version number into the IE version number (I think the conversion is Trident number + 4).

A problem with this approach is that it works for IE's compatibility mode, where I think it is only the version number that needs to have two values. But other browsers may try to mimic different aspects of compatibility, where they try to be compatible with different browsers, devices, whatever; there are lots of emulators. In fact, that's why the user agent strings are all lies in the first place. So it might be better to leave ReadableUserAgent alone and allow end users to create two different kinds of parsers: one which behaves as now (returning the 'compatible' browser info) and a new one which returns the 'implementation' browser info. The 'implementation' browser info could still come from the 'implementations_browser_info' element suggested above.

I have a feeling I have gone far deeper into discussing implementation details than I should have. These are all uninformed suggestions from an outsider! My knowledge of the user agent string machinations is not deep. From the perspective of a potential user, I don't care how UADetector exposes the 'implementation' version info as long as I can get it. UADetector looks like a great library and I hope a way of exposing this info can be found without compromising the API.
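A rough Java sketch of the delegate idea discussed in this comment, under stated assumptions: `AgentInfo` and `AgentParser` are hypothetical stand-ins for the existing result and parser types (they are not UADetector's real interfaces), and `ImplementationAwareParser` is an invented name. The wrapper leaves the reported ('compatible') info untouched and adds an implementation major version, applying the "Trident number + 4" rule only for Trident 4 to 7.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical stand-in for a parsed agent; not UADetector's ReadableUserAgent. */
interface AgentInfo {
    String name();
    int majorVersion();
}

/** Hypothetical stand-in for an existing database-backed parser. */
interface AgentParser {
    AgentInfo parse(String userAgent);
}

/** Wraps an existing parser and adds the implementation (real) major version. */
final class ImplementationAwareParser {
    private static final Pattern TRIDENT = Pattern.compile("Trident/(\\d{1,3})\\.");

    private final AgentParser delegate;

    ImplementationAwareParser(AgentParser delegate) {
        this.delegate = delegate;
    }

    Result parse(String userAgent) {
        // The delegate's answer is kept as-is: it is the 'compatible' version.
        AgentInfo reported = delegate.parse(userAgent);
        int implementationMajor = reported.majorVersion();

        Matcher m = TRIDENT.matcher(userAgent);
        if (m.find()) {
            int trident = Integer.parseInt(m.group(1));
            // Assumption: only Trident 4..7 (IE 8..11) follow the "+ 4" rule.
            if (trident >= 4 && trident <= 7) {
                implementationMajor = trident + 4;
            }
        }
        return new Result(reported, implementationMajor);
    }

    /** Reported info plus the implementation major version. */
    static final class Result {
        final AgentInfo reported;
        final int implementationMajor;

        Result(AgentInfo reported, int implementationMajor) {
            this.reported = reported;
            this.implementationMajor = implementationMajor;
        }
    }
}
```

When no Trident token is present, the implementation version simply falls back to the reported one, which matches the "default is the same value for both" behaviour suggested above.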

jsr88f commented 9 years ago

So as not to get bogged down in IE specifics like "Compatibility mode", I would simply add a new class called LayoutEngine. This bean would contain the name and version of the browser layout/rendering engine and could be accessed through ReadableUserAgent. In the case of IE11, the layout engine name would be "Trident" and the version "7.0". Other browsers have different engines, e.g. Firefox uses Gecko, Safari uses WebKit, etc. (see http://en.wikipedia.org/wiki/Web_browser_engine). It would be great if we could parse that info out of the User-Agent header somehow. Then the solution would be generic and would be sufficient to detect the true/native version of the IE browser, since there is a one-to-one mapping between Trident version and IE version.
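To make the LayoutEngine proposal concrete, here is a minimal Java sketch. It is an assumption about how such a bean might look, not existing UADetector code: the `parse` factory and the three regular expressions are illustrative, and the token-based approach has known limits (for example, any browser that sends an AppleWebKit token is reported as WebKit).

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical bean holding the layout engine name and version. */
final class LayoutEngine {
    private static final String VERSION = "(\\d+(?:\\.\\d+)*)";
    private static final Pattern TRIDENT = Pattern.compile("Trident/" + VERSION);
    private static final Pattern WEBKIT = Pattern.compile("AppleWebKit/" + VERSION);
    // Gecko browsers send "rv:<version>) Gecko/<date>"; IE 11 sends
    // "rv:11.0) like Gecko", which this pattern deliberately does not match.
    private static final Pattern GECKO = Pattern.compile("rv:" + VERSION + "\\) Gecko/");

    private final String name;
    private final String version;

    private LayoutEngine(String name, String version) {
        this.name = name;
        this.version = version;
    }

    String name() {
        return name;
    }

    String version() {
        return version;
    }

    /** Returns the first engine token found, or empty if none is recognized. */
    static Optional<LayoutEngine> parse(String userAgent) {
        String[] names = {"Trident", "WebKit", "Gecko"};
        Pattern[] patterns = {TRIDENT, WEBKIT, GECKO};
        for (int i = 0; i < patterns.length; i++) {
            Matcher m = patterns[i].matcher(userAgent);
            if (m.find()) {
                return Optional.of(new LayoutEngine(names[i], m.group(1)));
            }
        }
        return Optional.empty();
    }
}
```

With such a bean, checking `name().equals("Trident")` and the version would be enough to recover the native IE version regardless of the MSIE token the browser chooses to send.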

philsurette commented 9 years ago

Layout engine sounds great.

arouel commented 9 years ago

Is anybody interested in opening a pull request that prototypes the proposed changes?

nerdgore commented 9 years ago

Just to add my 2 cents to the discussion: I think @jsr88f has a very valid point. A lot of the time you have logic that depends on the layout engine rather than on the exact browser family. For example, right now it is very difficult to whitelist all WebKit browsers with uadetector.