vectordotdev / vector

A high-performance observability data pipeline.
https://vector.dev
Mozilla Public License 2.0

New `opentelemetry` source and sink #1444

Open binarylogic opened 4 years ago

binarylogic commented 4 years ago

OpenTelemetry is a specification for collecting observability data.

Their collector and libraries are of questionable quality. We'd like Vector to support OT through their various protobufs and become the best OT collector.

We should break this down into smaller tasks, likely around their various data types (logs, metrics, and traces). I'd like to start with tracing, if possible, to introduce that type into our data model.

loony-bean commented 4 years ago

Looks relevant #576

szibis commented 4 years ago

As OpenTracing merged with OpenCensus into the single OpenTelemetry project, is supporting it on both input and output being considered? It is important for making Vector a replacement for Datadog on the tracing layer (https://www.datadoghq.com/blog/opentelemetry-instrumentation/), and it may also help to build one layer for logs, metrics, and traces. This may also help to build an architecture that is not vendor-locked and allows switching to other providers easily.

binarylogic commented 4 years ago

Thanks @szibis. Agree, that's the idea with the OpenTelemetry components. We also want them to enforce our tracing data model when we start to implement it.

This may also help to build an architecture that is not vendor-locked and allows switching to other providers easily.

Agree! That's the primary idea behind Vector. Although, Vector wants to acknowledge current state and help users migrate towards open standards.

LukeMathWalker commented 3 years ago

Hi! We have been using Vector for a while as a log exporter (stdout -> AWS Kinesis -> ElasticSearch) while we have a separate pipeline for OpenTelemetry traces (application pushes to a Jaeger collector). We were considering the option of moving to Honeycomb for our observability needs and I noticed Vector provides a honeycomb sink, but it does not provide any OpenTelemetry source. Would we therefore be losing information by using stdout as source and honeycomb as sink compared to pushing our OpenTelemetry data directly into the OpenTelemetry collector and using that to push the data into Honeycomb?

I'd prefer to have a single agent for all our telemetry needs, but it'd be interesting to understand better :eyes:

kaarolch commented 3 years ago

Any update? I see that tasks related to OpenTelemetry were removed from several milestones?

jszwedko commented 3 years ago

@kaarolch we are also currently planning on adding OpenTelemetry support to Vector by the end of the year. This should be more definite soon.

xdatcloud commented 2 years ago

@kaarolch we are also currently planning on adding OpenTelemetry support to Vector by the end of the year. This should be more definite soon.

That sounds great! I see on the Vector Public Roadmap that this work will start next month. I would like to see the design details for the opentelemetry source and sink.

jszwedko commented 2 years ago

Sure thing. We'll likely be posting RFCs for this work before any work starts.

ovidiubuligan commented 2 years ago

@binarylogic since 2 years have passed, do you still think the otel collector is of questionable quality? I actually see it as much more flexible than Vector, for example in its sampling strategy.

jszwedko commented 2 years ago

This is still on our near term roadmap. We didn't get to it this quarter like we expected, but anticipate working on it in Q1.

ghost commented 2 years ago

Is this something that you'd be willing to accept open source contributions for? This is fairly major work and in theory close to being worked on, so I'd understand if you said no. I was personally considering setting up an opentelemetry sink for metrics.

xdatcloud commented 2 years ago

I found that the tracing support RFC isn't complete yet; I'm willing to join the discussion about this work.

jszwedko commented 2 years ago

This is something a team member is planning to pick up this quarter. If priorities shift and we aren't able to get to it, we'd be happy to see a PR for it. We are still in the process of adding support to Vector's internal data model for traces (https://github.com/vectordotdev/vector/pull/10483). That will need to happen first.

tsloughter commented 2 years ago

I was curious and checking if this existed yet and found this issue.

Please post to this issue if your priorities shift -- and if anyone else starts working on this please let it be known here :). I ask this because I'm always looking for more ways to learn more Rust and reasons to bother @blt :) -- I already work on OpenTelemetry, so thought this would be a good way to cover both those, assuming I can find the time myself.

blt commented 2 years ago

I ask this because I'm always looking for more ways to learn more Rust and reasons to bother @blt :)

:wave:

jszwedko commented 2 years ago

😄 will do. This issue will be assigned if we start work on it.

yandooo commented 2 years ago

@jszwedko is this issue still on the Vector team's radar? It looks like it has been greatly deprioritized multiple times since 2019. Thanks

spencergilbert commented 2 years ago

@jszwedko is this issue still on the Vector team's radar? It looks like it has been greatly deprioritized multiple times since 2019. Thanks

We have this scheduled for the upcoming quarter - starting with traces and going from there based on the state/stability of the event type 👍

Oloremo commented 2 years ago

Any updates? 🙏

jszwedko commented 2 years ago

Any updates? 🙏

This is likely to be on our roadmap for Q3.

tonychoe commented 2 years ago

We have use cases for collecting logs with Vector and sending them to various sinks, including OTLP-compatible endpoints. Vector's support for OTLP would be fantastic. :)

kamalmarhubi commented 2 years ago

@jszwedko

Any updates? :pray:

This is likely to be on our roadmap for Q3.

Any % you can put on how likely "likely" is? I'm planning out an observability pipeline project and I'd prefer to use Vector over Opentelemetry Collector for VRL and the Lua transform, but this has been repeatedly pushed down the Vector team's priority list so it's hard to plan around :-)


Edit: Oh I just saw that a logs source is possibly getting close to merged in #13320 which is a great start on this. Makes it seem more real!

spencergilbert commented 2 years ago

Any % you can put on how likely "likely" is? I'm planning out an observability pipeline project and I'd prefer to use Vector over Opentelemetry Collector for VRL and the Lua transform, but this has been repeatedly pushed down the Vector team's priority list so it's hard to plan around :-)

Edit: Oh I just saw that a logs source is possibly getting close to merged in #13320 which is a great start on this. Makes it seem more real!

We discussed roadmaps at the end of last week and the plan is for me to work on sources and sinks for logs/metrics/traces all quarter long.

caibirdme commented 2 years ago

We discussed roadmaps at the end of last week and the plan is for me to work on sources and sinks for logs/metrics/traces all quarter long.

Hi @spencergilbert, are there any task lists that we can pick up to accelerate this process?

spencergilbert commented 2 years ago

Hey @caibirdme, it sounded like you were interested in OTel logs sink support? If that's the case I opened https://github.com/vectordotdev/vector/issues/13622 to track and discuss.

If you'd prefer to work on a different portion of the source or sink, that's fine too. Just let me know!

caibirdme commented 2 years ago

Hey @caibirdme, it sounded like you were interested in OTel logs sink support? If that's the case I opened #13622 to track and discuss.

If you'd prefer to work on a different portion of the source or sink, that's fine too. Just let me know!

Yes, I'm interested in that because we're eager to have this feature in our system. We're building a new observability system based on OpenTelemetry and ClickHouse. Once the opentelemetry source is supported, we can ingest logs (OTel and non-OTel), unify them with VRL, and sink them to ClickHouse. The problem is that Vector's clickhouse sink, which is based on the HTTP protocol and the JSONEachRow format, is inefficient and incurs heavy overhead on ClickHouse. There are two ways to solve this:

  1. Optimize the clickhouse sink by switching to the ClickHouse native protocol or supporting a more efficient format such as Apache Arrow or Parquet.
  2. Support an OTel sink and export logs to opentelemetry-collector, which contains a high-performance ClickHouse exporter (the reason we don't use otel-collector directly is that our logs are in different formats and we need VRL to unify them).
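
For illustration, here is a minimal sketch of the "unify, then sink" step described above, written in Python rather than VRL. It is not Vector's implementation: the record shapes, the field names (`ts`, `level`, `app`, `msg`, and the OTel-style keys), and the `normalize` and `to_json_each_row` helpers are all assumptions made up for this example. It only shows the idea of mapping differently shaped logs onto one schema and serializing them as newline-delimited JSON, which is the shape ClickHouse's JSONEachRow format expects.

```python
import json


def normalize(record: dict) -> dict:
    """Map an OTel-style or ad hoc log record onto one flat, hypothetical schema."""
    if "body" in record:
        # Assumed shape of an OTel-style log record (illustrative keys only).
        resource = record.get("resource", {})
        return {
            "timestamp": record.get("time_unix_nano", 0) // 1_000_000_000,
            "severity": record.get("severity_text", "INFO"),
            "service": resource.get("service.name", "unknown"),
            "message": record["body"],
        }
    # Assumed shape of a non-OTel application log (illustrative keys only).
    return {
        "timestamp": record.get("ts", 0),
        "severity": record.get("level", "info").upper(),
        "service": record.get("app", "unknown"),
        "message": record.get("msg", ""),
    }


def to_json_each_row(records: list) -> str:
    """Serialize normalized rows as newline-delimited JSON, one object per line."""
    return "\n".join(json.dumps(normalize(r), sort_keys=True) for r in records)


if __name__ == "__main__":
    sample = [
        {
            "body": "disk full",
            "severity_text": "ERROR",
            "time_unix_nano": 1_700_000_000_000_000_000,
            "resource": {"service.name": "api"},
        },
        {"msg": "started", "level": "info", "ts": 1_700_000_001, "app": "worker"},
    ]
    # Each printed line is one row in the unified schema.
    print(to_json_each_row(sample))
```

Note that every row repeats every field name as text, which hints at why a columnar format such as Arrow or Parquet may be cheaper for ClickHouse to ingest.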

spencergilbert commented 2 years ago

  1. Optimize the clickhouse sink by switching to the ClickHouse native protocol or supporting a more efficient format such as Apache Arrow or Parquet.

I'd recommend opening an issue regarding the clickhouse sink, if you haven't/if there isn't one already. It's definitely out of scope for this issue/my current work - we'd be happy to look into improvements or review a proposed contribution to address the issues you've had.

It sounds like for your use-case it would ultimately be better to improve the clickhouse sink so you could drop an additional tool from your pipeline, but in the end it's definitely more about what and where you'd like to contribute.

cetanu commented 2 years ago

Is the work for this sliced up into actionable pieces somewhere that I can look at? Would like to contribute to this

jszwedko commented 2 years ago

Is the work for this sliced up into actionable pieces somewhere that I can look at? Would like to contribute to this

Hi @cetanu! Our near-term focus is on the source: adding support for ingesting metrics and traces. If you are interested in contributing to this, creating a new opentelemetry sink to send data via OTLP would likely be a good place to start.

ahlfors commented 2 years ago

Hi, has anyone run a benchmark or test on otel-collector lately? @binarylogic's words, "Their collector and libraries are of questionable quality", were posted nearly 3 years ago, so a benchmark or test of the current version of otel-collector would be very helpful. If anyone has done this, could you share it (or leave your conclusion if it cannot be shared)?

Update: I found a benchmark report for aws-otel-collector v0.21.0 (latest as of 2022-9-5), for your reference: https://aws-observability.github.io/aws-otel-collector/benchmark/report

I found that Vector has a related issue that is still open (2022-1-15): https://github.com/vectordotdev/vector/issues/13132

tsloughter commented 2 years ago

3 years ago the collector was alpha. It may not be 1.0 yet but it is used by many companies, including by those acting as SaaS APM collectors of other people's traces, metrics and logs.

ericsampson commented 2 years ago

@jszwedko is Spencer still working on these efforts?

spencergilbert commented 2 years ago

@jszwedko is Spencer still working on these efforts?

👋 I've been pulled off this work for the past couple weeks between time off and some priority updates to internal metrics. I should be starting on the source's trace support next week, followed by metrics.

NasAmin commented 2 years ago

Hi, is there any update for metrics support?

jszwedko commented 2 years ago

Hi, is there any update for metrics support?

Hey! We plan to pick this back up again in Q4 and target releasing by the end of the year.

NasAmin commented 2 years ago

👋 Just checking in, is this still on track?

jszwedko commented 2 years ago

👋 Just checking in, is this still on track?

Unfortunately this is likely to end up being delayed. I'm hopeful that we can at least work on ingesting metrics this quarter.

bsod90 commented 2 years ago

Lol, I've subscribed to this issue just to see a Datadog PM coming here once a quarter just to say "we're going to prioritize this in Q". Honestly, I think OPT export is against datadog business priorities and it's never happening, period.

ericsampson commented 2 years ago

@jszwedko would it be possible for y'all to consider doing traces first? Thanks

jszwedko commented 2 years ago

@jszwedko would it be possible for y'all to consider doing traces first? Thanks

Traces is likely to be a bigger lift since it is a relatively new data model in Vector and so is likely to have more unknowns. I think it is likely we will prioritize: ingesting metrics, egressing logs, egressing metrics, ingesting traces, egressing traces, in that order.

NasAmin commented 1 year ago

👋 Just checking in, is this still on track?

Unfortunately this is likely to end up being delayed. I'm hopeful that we can at least work on ingesting metrics this quarter.

@jszwedko Just checking, any update on supporting Otel metrics?

jszwedko commented 1 year ago

👋 Just checking in, is this still on track?

Unfortunately this is likely to end up being delayed. I'm hopeful that we can at least work on ingesting metrics this quarter.

@jszwedko Just checking, any update on supporting Otel metrics?

Nothing definite yet, but it is still a goal.

KFearsoff commented 1 year ago

I'd like to get the process going. Any PRs I can put out to get this done sooner rather than later?

spencergilbert commented 1 year ago

Hi @KFearsoff 👋

I do have adding metrics support to the existing opentelemetry source as a work item later this quarter, but otherwise we don't have any existing PRs to add functionality. Was there a particular area of functionality you were looking for first?

KFearsoff commented 1 year ago

I do have adding metrics support to the existing opentelemetry source as a work item later this quarter

This thread consists of "next quarter". I don't mean any offense to you or the Vector team, but I'd like to take the initiative given the opportunity 😄

Was there a particular area of functionality you were looking for first?

I'm particularly interested in tracing source and sink. I don't really care about metrics or logs, but I'm guessing there will be some overlap (like trace -> log transformation). I'd like to get the sources and sinks done before worrying about transforms, though.

otherwise we don't have any existing PRs to add functionality

I don't mind opening ones 😉 Will be referring to the merged OTel parts, Datadog Agent's code and the development docs from this repo. Hopefully I'll be able to come up with something that gets the job done.

spencergilbert commented 1 year ago

This thread consists of "next quarter". I don't mean any offense to you or the Vector team, but I'd like to take the initiative given the opportunity 😄

No offense taken! I've been disappointed personally with not being able to work on it, but that's just how prioritization goes sometimes 🙂

I'm particularly interested in tracing source and sink. I don't really care about metrics or logs, but I'm guessing there will be some overlap (like trace -> log transformation). I'd like to get the sources and sinks done before worrying about transforms, though.

It looks like we have a few issues open for transforms:

I'd agree that we need more integrations for receiving and sending trace events before the demand for transforms hits a breaking point. I'm not even sure all of the existing transforms have been updated to support receiving traces.

I don't mind opening ones 😉 Will be referring to the merged OTel parts, Datadog Agent's code and the development docs from this repo. Hopefully I'll be able to come up with something that gets the job done.

👍 sounds good! I think adding trace support to the existing opentelemetry source would make the most sense, and be an easier undertaking than adding a completely new sink. If you'd like to collaborate/discuss I'll open a specific issue for enhancing the source or we can chat in our Discord server.

I suspect much of the code for trace support should be straightforward. Currently the trace data model is identical to our log data model (type-wise), so I don't expect there will be too many surprises compared with the existing log implementation.

ericsampson commented 1 year ago

Godspeed @KFearsoff 🫡

KFearsoff commented 1 year ago

👍 sounds good! I think adding trace support to the existing opentelemetry source would make the most sense, and be an easier undertaking than adding a completely new sink. If you'd like to collaborate/discuss I'll open a specific issue for enhancing the source or we can chat in our Discord server.

I suspect much of the code for trace support should be straightforward. Currently the trace data model is identical to our log data model (type-wise), so I don't expect there will be too many surprises compared with the existing log implementation.

I'm mostly unsure whether it would be best to create a dummy source that takes in traces in the current trace data model, or to start by changing the trace data model (as described in the RFC about OTLP traces). I considered starting with the trace data model, because I think traces and logs are quite different, but I've looked it over and I think starting with that might be a little too ambitious for me 😅

Not sure if treating traces like logs is fine, though (especially because it would make it quite hard to create a sink). What do you think?

I'll join Discord a little bit later, eager to start!

spencergilbert commented 1 year ago

Not sure if treating traces like logs is fine, though (especially because it would make it quite hard to create a sink). What do you think?

I think given that's how we're handling Datadog traces today it should be fine, I'd agree that updating the data model is a much bigger task and would require a lot more consensus and coordination. I'll check what the rest of the team thinks and get back to you.

spencergilbert commented 1 year ago

I'm going to switch over to https://github.com/vectordotdev/vector/issues/17307 for further discussion specific to adding trace support here 👍