open-telemetry / semantic-conventions

Defines standards for generating consistent, accessible telemetry across a variety of domains
Apache License 2.0

Define `cloud.platform` and/or rename it #609

Open lmolkova opened 8 months ago

lmolkova commented 8 months ago

The https://github.com/open-telemetry/semantic-conventions/blob/main/docs/attributes-registry/cloud.md document does not explain what the cloud platform (the cloud.platform attribute) is.

Assuming cloud services are themselves instrumented and report spans/metrics to end users, what should they populate it with? For example, could AWS S3 or SQS be added to the cloud.platform enum and send their telemetry with a cloud.platform: aws_s3 attribute?

lmolkova commented 8 months ago

Based on the name and the existing enum, it looks like cloud.platform only makes sense in the context of a client application running on a service. If that is the only intent, it should be documented. And if cloud services that don't really run user code are instrumented, they should instead report their name in service.name.

This approach makes attribute-based correlation harder: if you want all spans that call foo or are reported by foo, you have to write something like spans | where "cloud.platform" == "foo" or "service.name" == "foo".
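To make the cost of that two-attribute query concrete, here is a minimal Python sketch. The span records are hypothetical plain dicts and `related_to` is an illustrative helper, not part of any OpenTelemetry API.

```python
# Hypothetical span records, modeled as plain dicts for illustration only.
spans = [
    {"name": "GET /blob", "attributes": {"cloud.platform": "foo"}},
    {"name": "PUT /queue", "attributes": {"service.name": "foo"}},
    {"name": "GET /health", "attributes": {"service.name": "frontend"}},
]


def related_to(spans, name):
    """Return spans where either cloud.platform or service.name equals `name`."""
    keys = ("cloud.platform", "service.name")
    return [
        span for span in spans
        if any(span["attributes"].get(key) == name for key in keys)
    ]


matches = related_to(spans, "foo")
```

Every consumer who wants "everything about foo" has to know about both keys, which is the correlation burden described above.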

Also, it's hard to draw a clear line on what qualifies as a platform then.

Alternative approach could be to

For example, an HTTP server span emitted by Azure Storage would carry the resource attribute cloud.service.name: azure_storage_blob. This can also coexist with service.name, which would be something internal, such as frontend, and would be reported to internal telemetry systems.
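A rough sketch of how the two attributes might sit side by side, using plain dicts rather than an SDK. Note that cloud.service.name is only the attribute proposed in this thread, not an established convention, and both helper functions are hypothetical.

```python
# Sketch only: cloud.service.name is the attribute proposed in this issue,
# not an established semantic convention. Plain dicts stand in for a resource.
def build_resource_attributes(internal_service, cloud_service):
    """Combine the internal service.name with the proposed cloud.service.name."""
    return {
        "service.name": internal_service,     # internal component, e.g. "frontend"
        "cloud.service.name": cloud_service,  # public-facing cloud product
    }


def reported_by(resource, cloud_service):
    """Check a single key to find telemetry emitted by a given cloud service."""
    return resource.get("cloud.service.name") == cloud_service


resource = build_resource_attributes("frontend", "azure_storage_blob")
```

With this shape, end users could filter on one key for the cloud product, while the provider's internal systems keep using service.name.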

pyohannes commented 8 months ago

There has been some discussion around cloud.platform recently (https://github.com/open-telemetry/semantic-conventions/pull/344, especially this comment).

There was also some discussion about whether it should be a resource-level attribute (as it is now) or whether it should be applicable to signals too.

joaopgrassi commented 8 months ago

As I commented on #344, my concern with using cloud.service.name is that people may think they need to report service.name in cloud.service.name when they run things on cloud services. I'm not sure that will even happen, but I can certainly see people being confused.

But I totally understand the use case for having both: where the software is running, plus the cloud service it is talking to. I guess we just need to find a name that better defines "the cloud service/product" and that does not conflict or get confused with service.name.