vectordotdev / vector

A high-performance observability data pipeline.
https://vector.dev
Mozilla Public License 2.0

Support ingesting OpenTelemetry traces #17307

Open spencergilbert opened 1 year ago

spencergilbert commented 1 year ago

RFC 11851 - 2022-03-15 - Opentelemetry traces source

spencergilbert commented 1 year ago

👋 @KFearsoff,

As you pointed out here, we could definitely run into issues when needing to emit correctly structured traces in sinks.

Currently we don't really have this problem because the only supported integrations are with Datadog traces, so as long as they're not mutated by a user into an invalid shape the end-to-end flow of traces works fine (given we just model traces as logs internally).

Our thinking at the time of the original implementation was that we weren't sure what requirements would be needed by our trace model, so we intentionally left it freeform. We intended to determine what was necessary to act as a "format agnostic" middleman and begin to encode that in a purpose-built container.

I think it's definitely possible that we/you could add trace support to the opentelemetry source today, but we'd need to make sure that if you routed opentelemetry -> datadog_traces it worked seamlessly - which of course could prove challenging or even impossible given the unknowns there.

Updating the TraceEvent without doing some discovery work implementing trace support in the opentelemetry source doesn't feel like the right direction to me personally. I think the best course of action would be doing some rough, spike level implementation in the source and see what we can do to support the datadog_traces sink as the destination. Taking note of what's working/not working/painful/etc and not necessarily expecting this initial work to be merged (though I don't see why that wouldn't be possible if everything just works out).

If you're still interested in doing this more experimental and exploratory work we'd be happy to collaborate with you - but since the scope and plan have somewhat changed I understand if it's less appealing.

david-hodgson-at-sky commented 1 year ago

Can I add my support for this issue, and add a use case: We are using a suite of Vectors as (aggregating & filtering) proxies between the secure cloud systems and our Observability platform (built using Grafana suite). This gives us a limited and manageable set of places where we have security tokens for auth&auth with the Observability platform.

Works for Metrics using Vector, doesn't work with Traces as the needed source is missing in Vector. :-(

The need is for a simple, fast OTLP trace source and sink, with limited management (filtering based on metadata, augmentation & manipulation of metadata, sampling) of the traces / spans.

fzyzcjy commented 1 year ago

Hi, are there any updates? I wonder when ingesting OpenTelemetry traces will be implemented. Thanks!

My use case is that I am interested in ingesting OpenTelemetry data into ClickHouse. I mainly followed the tutorial https://clickhouse.com/blog/storing-traces-and-spans-open-telemetry-in-clickhouse. However, given that I have already deployed Vector following https://clickhouse.com/blog/storing-log-data-in-clickhouse-fluent-bit-vector-open-telemetry, I hope to continue using Vector (and its helpful clickhouse output plugin!).

jszwedko commented 1 year ago

> Hi, are there any updates? I wonder when ingesting OpenTelemetry traces will be implemented. Thanks!
>
> My use case is that I am interested in ingesting OpenTelemetry data into ClickHouse. I mainly followed the tutorial https://clickhouse.com/blog/storing-traces-and-spans-open-telemetry-in-clickhouse. However, given that I have already deployed Vector following https://clickhouse.com/blog/storing-log-data-in-clickhouse-fluent-bit-vector-open-telemetry, I hope to continue using Vector (and its helpful clickhouse output plugin!).

Hey! No updates yet on our end unfortunately. We would be open to shepherding a contribution here though.

caibirdme commented 10 months ago

Is anyone working on this feature? If not, I'd like to invest some time in it.

gaby commented 8 months ago

@jszwedko Any updates on this and Metrics support?

Having Vector only support one type means folks have to run multiple applications instead of just Vector

davinkevin commented 8 months ago

😇 It's the main reason (Opentelemetry support, not just traces) why we haven't selected Vector in our stack… unfortunately (the alternative has higher memory and cpu requirement)

rektide commented 8 months ago

> I think it's definitely possible that we/you could add trace support to the opentelemetry source today, but we'd need to make sure that if you routed opentelemetry -> datadog_traces it worked seamlessly - which of course could prove challenging or even impossible given the unknowns there.

What are some of the known unknowns here? What makes this a hard problem?

jszwedko commented 8 months ago

> > I think it's definitely possible that we/you could add trace support to the opentelemetry source today, but we'd need to make sure that if you routed opentelemetry -> datadog_traces it worked seamlessly - which of course could prove challenging or even impossible given the unknowns there.
>
> What are some of the known unknowns here? What makes this a hard problem?

The big issue here is that Vector doesn't really have a trace data model yet. We have some preliminary support for traces from the Datadog Agent source that can be routed to the Datadog Traces sink, but there are a lot of assumptions about the data format that wouldn't hold for OTLP traces. I think what we would need to do is adopt a format (like we have for logs and metrics) and then map OTLP and Datadog Agent ingested traces to that format (the format could be OTLP) where each sink could map it from the internal format to the format expected by the destination.
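To make the idea above concrete, here is a minimal Rust sketch of a single internal span format that a source converts into and a sink converts out of. This is not Vector's actual code: `InternalSpan`, `OtlpSpan`, `DatadogSpan` and all of their fields are hypothetical, simplified stand-ins invented for illustration. Keeping only the low 64 bits of the 128-bit OTLP trace ID for Datadog is likewise an assumption of this sketch.

```rust
use std::collections::BTreeMap;

/// Hypothetical internal span model (NOT Vector's real TraceEvent).
#[derive(Debug, Clone, PartialEq)]
pub struct InternalSpan {
    pub trace_id: u128,
    pub span_id: u64,
    pub parent_span_id: Option<u64>,
    pub name: String,
    pub start_unix_nano: u64,
    pub end_unix_nano: u64,
    pub attributes: BTreeMap<String, String>,
}

/// Simplified stand-in for an OTLP span (IDs are raw big-endian bytes).
pub struct OtlpSpan {
    pub trace_id: [u8; 16],
    pub span_id: [u8; 8],
    pub parent_span_id: Option<[u8; 8]>,
    pub name: String,
    pub start_time_unix_nano: u64,
    pub end_time_unix_nano: u64,
    pub attributes: BTreeMap<String, String>,
}

/// Simplified stand-in for a Datadog-style span (64-bit IDs, start + duration).
pub struct DatadogSpan {
    pub trace_id: u64,
    pub span_id: u64,
    pub parent_id: u64,
    pub name: String,
    pub start: i64,
    pub duration: i64,
    pub meta: BTreeMap<String, String>,
}

/// Source side: map the wire format into the internal format.
impl From<OtlpSpan> for InternalSpan {
    fn from(s: OtlpSpan) -> Self {
        InternalSpan {
            trace_id: u128::from_be_bytes(s.trace_id),
            span_id: u64::from_be_bytes(s.span_id),
            parent_span_id: s.parent_span_id.map(u64::from_be_bytes),
            name: s.name,
            start_unix_nano: s.start_time_unix_nano,
            end_unix_nano: s.end_time_unix_nano,
            attributes: s.attributes,
        }
    }
}

/// Sink side: map the internal format into what the destination expects.
impl From<&InternalSpan> for DatadogSpan {
    fn from(s: &InternalSpan) -> Self {
        DatadogSpan {
            // Assumption: the truncating cast keeps only the low 64 bits.
            trace_id: s.trace_id as u64,
            span_id: s.span_id,
            // Assumption: a parent ID of 0 marks a root span.
            parent_id: s.parent_span_id.unwrap_or(0),
            name: s.name.clone(),
            start: s.start_unix_nano as i64,
            // saturating_sub avoids underflow if end precedes start.
            duration: s.end_unix_nano.saturating_sub(s.start_unix_nano) as i64,
            meta: s.attributes.clone(),
        }
    }
}
```

With this shape, a span would flow as `DatadogSpan::from(&InternalSpan::from(otlp_span))`, and a Datadog Agent source would get its own analogous `From` implementation into `InternalSpan`. Each new source or sink then needs only one mapping to or from the internal format rather than one per source/sink pair.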

hdost commented 8 months ago

Are there any counter-proposals to the tracing data model? Could this also be something that could be handled with some transform functions? Perhaps it could be "based" on OTel, and if there are alterations, there would need to be a mapping for how those alterations are handled. An asynchronous process could then be used to see about getting potential alterations into OTel.

jszwedko commented 8 months ago

OTLP could indeed be the model for traces. I think we'd just want to see an RFC around it that compares with any potential alternatives. I'm not aware of any.

hdost commented 7 months ago

@jszwedko in fact I started one, #20170, but looking at https://github.com/vectordotdev/vector/blob/master/rfcs/2022-03-15-11851-ingest-opentelemetry-traces.md it seems like the language states that OpenTelemetry would be used. This was in fact an "OpenTelemetry" source RFC. So I don't actually know whether the RFC I've created is necessary, except to make it specifically official.

1puls2is3 commented 4 months ago

This need has been waiting for 5 years, since 2019. I wish it had been realized sooner. Thank you to all those who have worked on this! 🙏

NasAmin commented 3 months ago

Is source: opentelemetry and sink: datadog_traces supposed to work?

I am trying to use this but don't see any traces in Datadog. I don't see any errors/warnings in Vector logs either.

jszwedko commented 3 months ago

> Is source: opentelemetry and sink: datadog_traces supposed to work?
>
> I am trying to use this but don't see any traces in Datadog. I don't see any errors/warnings in Vector logs either.

I think it can but you have to handle all of the mapping yourself. That is: map the OTLP trace to a format that Datadog understands. Trace support in Vector is pretty rudimentary. Ideally, the datadog_traces sink would handle that mapping for you (similar to how the Datadog OpenTelemetry Exporter handles mapping traces).

NasAmin commented 3 months ago

Ah so what would be the recommended approach for opentelemetry collector?

I'd rather wait for the datadog traces to handle the mappings.

In the meantime, should I use OpenTelemetry's Datadog exporter from the OTel Collector and not use Vector at all?

jszwedko commented 3 months ago

> In the meantime, should I use OpenTelemetry's Datadog exporter from the OTel Collector and not use Vector at all?

This would be my suggestion for now.

NasAmin commented 3 months ago

@jszwedko One more question, does the Vector unit testing framework support testing traces?

jszwedko commented 3 months ago

> @jszwedko One more question, does the Vector unit testing framework support testing traces?

Unfortunately, it looks like not yet. I'm only seeing support for logs and metrics: https://github.com/vectordotdev/vector/blob/210ff0925d391213556f07bf6ce621967f0368ca/src/config/mod.rs#L518-L526

scila1996 commented 2 months ago

I'd like to see an OpenTelemetry codec that works with other sources, e.g. the Kafka source. Many companies that use OpenTelemetry to collect logs/traces output them to Kafka (in OTLP format) with the OpenTelemetry Kafka exporter to handle large volumes of data. I'm using Vector's opentelemetry source, but I still need a sidecar application like the OpenTelemetry Collector to receive data from Kafka and push it to the Vector OTLP source -> Elasticsearch sink. So if Vector could support the Kafka source directly with an OTLP codec, it would be a great feature for DevOps :))