Open rogercoll opened 1 month ago
Pinging code owners:
pkg/ottl: @TylerHelmuth @kentquirk @bogdandrutu @evan-bradley
Hi @rogercoll, I took a quick look at this.
It looks like the UA parser provides a function to parse Os info from a user agent string.
In https://github.com/ua-parser/uap-core/blob/master/tests/test_os.yaml I found various user agent strings; however, all expected test results consist of family, major, minor, patch and patch_minor, and not type, name, version, build_id and description. So there is no 1:1 mapping between the two.
As a first iteration we could map:
user_agent.os.name to Os.family
user_agent.os.version to {Os.major}.{Os.minor}.{Os.patch}.{Os.patch_minor}
Maybe we could also set user_agent.os.type by performing a lookup based on Os.family, e.g., Android -> Linux, WatchOS -> iOS, etc.
What about the rest of the fields you proposed though?
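To make the first-iteration mapping concrete, here is a rough Go sketch. It is only an illustration, not the actual OTTL implementation: `osInfo` is a hypothetical stand-in for the parser's OS result (field names follow the keys in test_os.yaml), `osVersion` and `osAttributes` are made-up helper names, and the lookup table contains only the two examples mentioned above. Stopping the version at the first empty component is also an assumption, made so that a missing patch does not leave trailing dots.

```go
package main

import (
	"fmt"
	"strings"
)

// osInfo is a hypothetical stand-in for the parser's OS result; the field
// names follow the keys used in test_os.yaml.
type osInfo struct {
	Family, Major, Minor, Patch, PatchMinor string
}

// familyToType is an illustrative, deliberately incomplete lookup table
// seeded only with the two examples from the comment above.
var familyToType = map[string]string{
	"Android": "Linux",
	"WatchOS": "iOS",
}

// osVersion joins the version components with dots, stopping at the first
// empty one (an assumption, to avoid trailing dots).
func osVersion(o osInfo) string {
	var parts []string
	for _, p := range []string{o.Major, o.Minor, o.Patch, o.PatchMinor} {
		if p == "" {
			break
		}
		parts = append(parts, p)
	}
	return strings.Join(parts, ".")
}

// osAttributes builds the proposed user_agent.os.* attributes, omitting
// any attribute whose value cannot be determined.
func osAttributes(o osInfo) map[string]string {
	attrs := map[string]string{}
	if o.Family != "" {
		attrs["user_agent.os.name"] = o.Family
	}
	if v := osVersion(o); v != "" {
		attrs["user_agent.os.version"] = v
	}
	if t, ok := familyToType[o.Family]; ok {
		attrs["user_agent.os.type"] = t
	}
	return attrs
}

func main() {
	attrs := osAttributes(osInfo{Family: "Android", Major: "13"})
	fmt.Println(attrs["user_agent.os.name"], attrs["user_agent.os.version"], attrs["user_agent.os.type"])
}
```

A family that is missing from the table simply gets no user_agent.os.type attribute, which seems safer than guessing.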
@ioandr Thanks for taking a look into this. Based on your research, there are three attributes which we cannot map 1:1 with the UA package parser function. I would propose the following:
user_agent.os.type: Same as you shared, a lookup map based on Os.family
user_agent.os.build_id: {Os.patch_minor}?
user_agent.os.description: The whole OS string included in the User Agent. For example: "Mozilla/5.0 (X11; Linux x86_64; rv:127.0) Gecko/20100101 Firefox/127.0" → X11; Linux x86_64; rv:127.0
I would not make these a blocker, though: if extracting them is not clear or feasible, I would start with the 1:1 mapping from the UA package.
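For the description example, one naive approach would be to take the first parenthesized group of the user agent string. The Go sketch below assumes exactly that; `osDescription` is a hypothetical helper name, and the pattern is not guaranteed to hold for every user agent, since some have no parenthesized group and others put non-OS details there.

```go
package main

import (
	"fmt"
	"regexp"
)

// firstParenGroup matches the first parenthesized group in a string and
// captures its contents (no nested parentheses are handled).
var firstParenGroup = regexp.MustCompile(`\(([^)]*)\)`)

// osDescription returns the text inside the first parenthesized group of
// the user agent, or "" when there is none.
func osDescription(userAgent string) string {
	m := firstParenGroup.FindStringSubmatch(userAgent)
	if m == nil {
		return ""
	}
	return m[1]
}

func main() {
	ua := "Mozilla/5.0 (X11; Linux x86_64; rv:127.0) Gecko/20100101 Firefox/127.0"
	fmt.Println(osDescription(ua))
}
```

Returning an empty string when nothing matches lets the caller skip the attribute instead of emitting a misleading value.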
Thanks for the follow-up @rogercoll, I will take a stab at this and open a PR shortly.
Hi @rogercoll, I opened a PR that adds name and version as discussed above. I also updated existing test cases as needed.
For the time being I didn't add the extra fields for the reasons below:
type: I couldn't find an exhaustive, trustworthy mapping to go from OS family to OS type. Let's tackle this in the next iteration.
build_id: Mapping this to patch_minor does not look accurate, based on what I found searching online. Build IDs are mostly common for Windows (e.g., 22621) and macOS (e.g., 20B29).
description: It seems that the UA parser does not provide a function to return the "original OS string". This probably requires some regex matching, which might be tricky to get right for all user agent strings.
Other than these, please let me know if I need to update any OTel Collector documentation. I couldn't find any relevant places other than the semconv documentation:
https://opentelemetry.io/docs/specs/semconv/attributes-registry/os/
Component(s)
pkg/ottl
Is your feature request related to a problem? Please describe.
UserAgent semantic convention attributes can be extracted using the OTTL UserAgent function: https://github.com/pchila/opentelemetry-collector-contrib/tree/7da12e47eb9cf719aa593f9935bce9ba72844703/pkg/ottl/ottlfuncs#useragent (implemented in https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/34172)
The current extracted attributes are user_agent.name, user_agent.version and user_agent.original. But more information can be extracted from the user_agent.original string, like the OS related information.
Semantic conventions proposal: https://github.com/open-telemetry/semantic-conventions/issues/1433
Current Elastic ECS user_agent OS attributes: https://www.elastic.co/guide/en/ecs/current/ecs-user_agent.html#_field_reuse_30
Describe the solution you'd like
Extract additional fields from the user_agent:
Describe alternatives you've considered
No response
Additional context
This functionality would be very helpful for log and metric analytics. For example, an Nginx Ingress Controller log record contains the user agent, so this function could be configured in the collector to extract the OS information from all Nginx logs. Dashboards and alerts can then be built on this information: Which OS has the most errors? Which are the most common OS versions?