open-telemetry / semantic-conventions

Defines standards for generating consistent, accessible telemetry across a variety of domains
Apache License 2.0

Introduction of a Synthetic Attribute for Server Span Telemetry #1127

Open JacksonWeber opened 3 weeks ago

JacksonWeber commented 3 weeks ago

Area(s)

area:browser

Is your change request related to a problem? Please describe.

I would like to be able to identify telemetry created by synthetic sources such as bots or crawlers. This issue proposes defining conventions for marking spans as originating from a synthetic source.

Describe the solution you'd like

I would like to introduce an attribute to the HTTP server span semantic conventions, as well as to metrics and logs, whose value is a low-cardinality string such as the following:

synthetic -> "not set" | "bot" | "synthetic test"

Setting the synthetic attribute to "not set" indicates telemetry that was not generated by a synthetic source. This convention would help in scenarios where a user wants to separate telemetry generated by frequent synthetic tests or web crawlers from telemetry generated by direct human engagement.

Which of the three options a span falls into could be determined by maintaining a list of known synthetic sources, or by making the decision user-configurable.
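
To make the two options above concrete, here is a minimal sketch of what such a classifier might look like. It assumes the decision is based on the request's User-Agent header, which this issue does not specify. The function name `classify_synthetic` and the marker lists are hypothetical and are not part of any existing OpenTelemetry API or maintained list.

```python
from typing import Iterable, Optional

# Hypothetical built-in markers (assumption): a real list of known synthetic
# sources would need to be curated and maintained somewhere.
DEFAULT_BOT_MARKERS = ("googlebot", "bingbot", "crawler", "spider")
DEFAULT_TEST_MARKERS = ("synthetic-monitor", "availability-test")


def classify_synthetic(
    user_agent: Optional[str],
    bot_markers: Iterable[str] = DEFAULT_BOT_MARKERS,
    test_markers: Iterable[str] = DEFAULT_TEST_MARKERS,
) -> str:
    """Return one of the three proposed values: "not set", "bot", "synthetic test"."""
    if not user_agent:
        # No signal available, so the source is treated as not synthetic.
        return "not set"
    ua = user_agent.lower()
    # Test markers are checked first so that a user-configured test agent
    # takes precedence over a generic bot match.
    if any(marker.lower() in ua for marker in test_markers):
        return "synthetic test"
    if any(marker.lower() in ua for marker in bot_markers):
        return "bot"
    return "not set"
```

A user could override either list to cover the "user configurable" option, for example `classify_synthetic(ua, test_markers=["my-probe"])`. The returned string would then be recorded as the value of the proposed synthetic attribute on the span.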

Describe alternatives you've considered

While we could consider setting the synthetic attribute to a Boolean value, I believe the extra granularity of the low-cardinality string would be valuable.

Additional context

No response

MSNev commented 3 weeks ago

https://github.com/open-telemetry/opentelemetry-specification/issues/1634