open-telemetry / opentelemetry.io

The OpenTelemetry website and documentation
https://opentelemetry.io
Creative Commons Attribution 4.0 International

document the concept relations between OTEL and common backends, and how those backends are visualizing OTEL data #1768

Open svrnm opened 2 years ago

svrnm commented 2 years ago

Via this comment from @legendecas

I think it would be helpful to document the concept relations between OTEL and common backends, and how those backends are visualizing OTEL data. I've heard complaints about confusion caused by misaligned terms between OTEL and those backends. Though, this can be documented somewhere else.

SuryanarayanaPeri commented 1 year ago

@svrnm - Can you please provide more insight into which common backends you had in mind when you created this issue? Is this more related to Grafana / Kibana style visualization?

svrnm commented 1 year ago

@svrnm - Can you please provide more insight into which common backends you had in mind when you created this issue? Is this more related to Grafana / Kibana style visualization?

@SuryanarayanaPeri please have a look at the initial discussion linked via the comment. @legendecas can you elaborate in more detail what your thoughts have been?

legendecas commented 1 year ago

IIRC, the original thread is about resource attributes and their visualization in backends. The OTEL Resource is supported in both metrics and traces. However, some backends, such as Prometheus, do not support arbitrary resource attributes. Instead, a limited set of resource attributes is translated when exporting to Prometheus: https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/compatibility/prometheus_and_openmetrics.md#resource-attributes. I think it would be good to document the differences here.
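To make the translation concrete, here is a minimal sketch of how resource attributes might be split when exporting to Prometheus. It assumes my reading of the linked compatibility document: `service.namespace` and `service.name` are combined into the `job` label, `service.instance.id` becomes the `instance` label, and the remaining attributes end up as labels on a `target_info` metric. The function names `sanitize` and `to_prometheus_labels` are hypothetical and not part of any exporter API, and the name sanitization rule is simplified.

```python
import re


def sanitize(key):
    # Assumption: characters outside [a-zA-Z0-9_] are replaced with "_".
    return re.sub(r"[^a-zA-Z0-9_]", "_", key)


def to_prometheus_labels(resource_attrs):
    """Split OTel resource attributes into target labels and target_info labels."""
    attrs = dict(resource_attrs)
    name = attrs.pop("service.name", None)
    namespace = attrs.pop("service.namespace", None)
    instance = attrs.pop("service.instance.id", None)

    target = {}
    if name:
        # job is "<namespace>/<name>", or just "<name>" without a namespace
        target["job"] = f"{namespace}/{name}" if namespace else name
    if instance:
        target["instance"] = instance

    # Everything else is assumed to become a label on the target_info metric.
    info = {sanitize(k): str(v) for k, v in attrs.items()}
    return target, info


target, info = to_prometheus_labels({
    "service.name": "checkout",
    "service.namespace": "shop",
    "service.instance.id": "i-1",
    "host.name": "node-a",
})
print(target, info)
```

The point of the sketch is that an arbitrary attribute such as `host.name` does not show up as a label on every metric series in Prometheus, which is exactly the kind of difference end users run into.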

svrnm commented 1 year ago

Thank you for the clarification, @legendecas. I hope that helps, @SuryanarayanaPeri.

Aaron-Ritter commented 3 months ago

I was looking for something similar today. Basically, I want to know which vendors/backends support the OTEL standard, and to what extent.

For example, the spec compliance matrix (https://github.com/open-telemetry/opentelemetry-specification/blob/main/spec-compliance-matrix.md) is extremely comprehensive when it comes to the technologies.

But if I want to understand which vendors really support OTEL completely, I only have a simple yes or no available in https://opentelemetry.io/ecosystem/vendors/. In fact, a lot of them are only "partially" compliant, depending on how closely you look at the specification. As a simple example, some might support traces but not metrics, and some might support both but only certain implementations of metrics.

What would be great is a way of testing it. If I could wish for something, it would be a mock collector that I can point at a backend/vendor, which generates data according to the standard so that I could double-check whether it actually shows up, and if so, how. Such a mock collector could also be used for official verification.

But I might be missing the point: does "OTEL: yes" mean that as soon as a backend supports something standardized by OTEL it counts as yes, or is there a minimum requirement for a backend to be listed as yes? Or should there be a yes/no for subsections of the standard?

svrnm commented 2 months ago

Following up on this: going beyond "vendor X does OpenTelemetry somehow" is not an easy task, because what you are asking for is basically a compliance program. That requires not only a technical piece (send telemetry in a certain format to a backend and then check, semi-automatically, whether the end user can consume the data in a reasonable way), but also a non-technical part, where we talk about verification and re-certification of that compliance. There have been talks about that, but it is well beyond the scope of this particular issue.

A potential precursor to that is that we allow vendors to self-attest that their OpenTelemetry support includes certain things or not, e.g. an easy start could be just the signals supported to ingest & to visualize. We would need to add a disclaimer that this is self-attested by the vendors, and that end users can report vendors that attest to supporting a signal they do not actually support. If this works well, we could go beyond the basic signals and also think about more features being supported. However, I do not think that a full spec-compliance-matrix is what we should be aiming for, especially since this matrix is not for backends but for SDK implementations. So it boils down to what end users would like to see supported in backends (metric instrument types, span events, exemplars, signal correlation, etc.).

cartermp commented 2 months ago

I think there's value in having a concepts page that describes a few things:

I agree with @svrnm though, we should not be in the business of auditing vendors and OSS tools for their level of OTel support.

Aaron-Ritter commented 2 months ago

@cartermp I think the very first point, about defining what a backend is, could clarify a lot of the following points.

From what I understand, many places currently refer to it as "the backend you send data to", including e.g. the components documentation: https://opentelemetry.io/docs/concepts/components/.

Whereas https://opentelemetry.io/docs/specs/otel/vendors/ goes into a bit more detail under "Supports OpenTelemetry":

“Supports OpenTelemetry” means the vendor must accept the output of the default SDK through one of two mechanisms:

Which, from what I understand, is a backend?

Which then gets us this list https://opentelemetry.io/ecosystem/vendors/.

And Native OTLP basically means following this: https://opentelemetry.io/docs/specs/otlp/.

So whatever solution is capable of receiving OTLP data is a Vendor and equally a Backend?

But that standard also defines all signal endpoints (https://opentelemetry.io/docs/specs/otlp/#otlphttp-request), so if a Vendor does not support these endpoints, is it still Native OTLP because it supports part of the standard?
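As a rough illustration of the question, the sketch below lists the default OTLP/HTTP paths per signal (`/v1/traces`, `/v1/metrics`, `/v1/logs`) and reports which ones a vendor would be missing. The helper names `signal_urls` and `missing_signals` and the example base URL are hypothetical; the list of supported signals would have to come from the vendor's own documentation.

```python
# Default OTLP/HTTP URL paths per signal, as defined in the OTLP specification.
OTLP_HTTP_PATHS = {
    "traces": "/v1/traces",
    "metrics": "/v1/metrics",
    "logs": "/v1/logs",
}


def signal_urls(base):
    """Build the per-signal OTLP/HTTP URLs for a base endpoint."""
    base = base.rstrip("/")
    return {signal: base + path for signal, path in OTLP_HTTP_PATHS.items()}


def missing_signals(supported):
    """Return the signals with a default OTLP/HTTP path that are not supported."""
    return sorted(set(OTLP_HTTP_PATHS) - set(supported))


# Hypothetical vendor endpoint that only accepts traces.
print(signal_urls("http://localhost:4318"))
print(missing_signals(["traces"]))
```

A vendor for which `missing_signals` is non-empty accepts only part of what the protocol defines, which is the "partially Native OTLP" case I am asking about.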

To your other points: by kinds of backend, do you mean the types of OTEL signals (https://opentelemetry.io/docs/concepts/signals/) they are able to process?

Would you say that visualization capabilities are things like how backends process the data, e.g. aggregating, correlating, displaying different types of metrics, displaying traces/spans in different views (map, waterfall), filter possibilities, correlating the different signals with each other, etc.? If so, how detailed should that be?

And by influencing telemetry, do you mean how a backend manipulates, stores, and possibly aggregates it?

svrnm commented 2 months ago

Wow, having a page that describes what a backend is, sounds so obvious that it hurts that we do not have this yet, so, yes, please!! thanks @cartermp for coming up with that 🚀

So whatever solution is capable of receiving OTLP data is a Vendor and equally a Backend?

Here it starts to get messy, because the list of vendors that we have is not exclusively backends: there are some Cloud Service Providers (which kind of have backends these days, but didn't when they were added), there are some Observability Pipelines (like observiq), and there are also some "databases", which lack visualization.

Aaron-Ritter commented 2 months ago

A potential precursor to that is that we allow vendors to self-attest that their OpenTelemetry support includes certain things or not, e.g. an easy start could be just the signals supported to ingest & to visualize.

@svrnm how would you argue the point on OTLP: if a vendor does not process all of the signals specified in the standard, is that still "OTLP: yes"? Or would such a refinement ultimately also change the overall OTLP support to no?

I think splitting vendors into different capabilities, e.g. first based on signals, would already help to get a better overview, even if it's self-attested.

Aaron-Ritter commented 2 months ago

Here it starts to get messy, because the list of vendors that we have is not exclusively backends: there are some Cloud Service Providers (which kind of have backends these days, but didn't when they were added), there are some Observability Pipelines (like observiq), and there are also some "databases", which lack visualization.

So instead of OTLP, it would be the different signals in the different stages: receiving / processing / storing / visualizing? That is just a guess at possible stages; these should be defined too?
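To illustrate what such a per-signal, per-stage listing could look like, here is a small sketch of a self-attested capability record. Everything in it is hypothetical: the stage names are just my guess from above, `VendorSupport` and `ExampleVendor` are made-up names, and the rule that a "backend" must both store and visualize a signal is only one possible definition.

```python
from dataclasses import dataclass, field

SIGNALS = ("traces", "metrics", "logs")
# Guessed stages; the project would need to define these properly.
STAGES = ("receive", "process", "store", "visualize")


@dataclass
class VendorSupport:
    name: str
    # Maps a signal to the set of stages the vendor self-attests to support.
    attested: dict = field(default_factory=dict)

    def supports(self, signal, stage):
        return stage in self.attested.get(signal, set())

    def is_backend_for(self, signal):
        # Hypothetical rule: a backend stores and visualizes the signal.
        return self.supports(signal, "store") and self.supports(signal, "visualize")

    def matrix(self):
        """Return a full yes/no table of signals by stages."""
        return {s: {st: self.supports(s, st) for st in STAGES} for s in SIGNALS}


vendor = VendorSupport(
    "ExampleVendor",
    {"traces": {"receive", "store", "visualize"}, "metrics": {"receive"}},
)
print(vendor.is_backend_for("traces"), vendor.is_backend_for("metrics"))
print(vendor.matrix()["logs"])
```

With a record like this, an observability pipeline or a database would show up as receiving or storing a signal without visualizing it, instead of being lumped together with backends under a single yes.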