nielsbasjes / yauaa

Yet Another UserAgent Analyzer
https://yauaa.basjes.nl
Apache License 2.0

Microsoft Outlook Pro #67

Closed mmorel-35 closed 6 years ago

mmorel-35 commented 7 years ago

I found a problem with this user agent. A language is being detected from the 'Pro' token, but it seems to me that it means "professional" rather than Provençal. And since this is Microsoft Outlook, why isn't there an 'Email Client' agent class?


- test:
    input:
      user_agent_string: 'Microsoft Office/16.0 (Microsoft Outlook 16.0.8241; Pro)'
    expected:
      DeviceClass  : 'Unknown'
      DeviceName  : 'Unknown'
      DeviceBrand  : 'Unknown'
      OperatingSystemClass  : 'Unknown'
      OperatingSystemName  : 'Unknown'
      OperatingSystemVersion  : '??'
      LayoutEngineClass  : 'Unknown'
      LayoutEngineName  : 'Unknown'
      LayoutEngineVersion  : '??'
      LayoutEngineVersionMajor  : '??'
      AgentClass  : 'Special'
      AgentName  : 'Microsoft Office'
      AgentVersion  : '16.0'
      AgentVersionMajor  : '16'
      AgentNameVersion  : 'Microsoft Office 16.0'
      AgentNameVersionMajor  : 'Microsoft Office 16'
      AgentLanguage  : 'Old Provençal (to 1500)'
      AgentLanguageCode  : 'pro'

nielsbasjes commented 7 years ago

Thanks. I immediately fixed the 'pro' problem. The reason for not having an email client class is twofold:

  1. In my use case (web traffic) the percentage of email client traffic is so low that it doesn't show up in any significant numbers in the examples I have. This means that these clients are effectively hidden from me. Simply put, I don't have a clear list of examples to work with.
  2. The cases I have seen are all very specific (like the one you showed here), and so far I can only see that detecting these would require an explicit list of user agents/patterns. I would like to keep lists like that to a minimum.

mmorel-35 commented 7 years ago

I have contributed to browscap, where they create files with regexes to analyse user agent strings. The browser type is classified in fine detail, but the files need to be updated every time new versions come out. Your approach is flexible enough not to need a list of those versions. They also have a great number of unit tests on user agent strings. Concerning the number of files, what is the problem with it? Is it because it increases the size of your jar?

nielsbasjes commented 7 years ago

My concern is not with the number of files but with the code maintenance effort needed to keep it working in the future. So I chose to go for patterns (that I hope will work with future devices too) instead of exhaustive lists of devices. As an example: I do not have a list of all Samsung devices; I simply say that if the name of the device starts with "GT-" then it is a Samsung device.
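
To illustrate the prefix idea, here is a minimal Python sketch. It is not how Yauaa implements it (Yauaa is a Java library with its own rule files); the names `PREFIX_RULES` and `brand_from_device_name` are hypothetical, and only the "GT-" rule comes from the comment above.

```python
# Hypothetical rule table: (device-name prefix, brand).
# Only the "GT-" entry is taken from the discussion; everything else
# here (names, structure, fallback value) is an illustrative assumption.
PREFIX_RULES = [
    ("GT-", "Samsung"),
]


def brand_from_device_name(device_name: str) -> str:
    """Return the brand of the first matching prefix, else 'Unknown'."""
    for prefix, brand in PREFIX_RULES:
        if device_name.startswith(prefix):
            return brand
    return "Unknown"


if __name__ == "__main__":
    # A device name that was never listed explicitly still matches the rule.
    print(brand_from_device_name("GT-I9300"))
    print(brand_from_device_name("Pixel 7"))
```

The point of the design is that one rule covers device names that do not exist yet, whereas an exhaustive list has to be edited for every new model.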

BTW: If you can point me to a list of UAS that are email clients then I think it would be a good idea to detect those too.

mmorel-35 commented 7 years ago

You might want to take a look at udger: it has a list of different kinds of user agents, including email clients. browscap also unit tests some examples of them.

nielsbasjes commented 6 years ago

Added Email as a new AgentClass. I'll add more test cases in the next few weeks.

nielsbasjes commented 6 years ago

I have added a few more cases. Since this is a rather low-traffic corner, I'm leaving the rest to new issue reports.