elastic / beats

:tropical_fish: Beats - Lightweight shippers for Elasticsearch & Logstash
https://www.elastic.co/products/beats

[Libbeat][Filebeat][outputs] Add Codec and exporter to serialize and export batch of beat events in `OTLP` format #32549

Open shivanshuraj1333 opened 2 years ago

shivanshuraj1333 commented 2 years ago

Describe the enhancement:

To make Filebeat compatible with OpenTelemetry, i.e. to export beat events in OTLP format so that the OpenTelemetry Collector can ingest them, we need a codec that serializes a batch of beat events into the OTLP Log Data Model.

To do that, we need to add:

  1. a Codec that converts a batch of beat events into the OTLP log data model. It should be added under libbeat/outputs/codec, alongside the existing format and JSON codecs.
  2. an Exporter that supports OTLP over HTTP/gRPC. It should be added under libbeat/outputs, alongside the kafka, logstash, and redis exporters, and could be called otlp.

Implementation details:

[Image: diagram, alt text "Copy of error exception(1)"]

The above diagram explains the translation of a batch of beat events into OTLP Log Data Model.

Describe a specific use case for the enhancement or feature:

Filebeat is a great tool, and with the increased adoption of OpenTelemetry it would be really nice to make Filebeat compatible with it.

elasticmachine commented 2 years ago

Pinging @elastic/elastic-agent-data-plane (Team:Elastic-Agent-Data-Plane)

shivanshuraj1333 commented 2 years ago

@andrewkroh, can you help me get this reviewed so that I can open a PR? Thanks!

cmacknz commented 2 years ago

Hi @shivanshu1333, we are thinking about this internally. I'm not sure when we'll have a decision.

For context, adding new outputs has a long tail of work in the form of ongoing user deployment support after they are introduced. Writing the code is only part of the work here. We need to be careful about when we introduce new outputs to ensure we can properly support and test them in the various environments and configurations they will be deployed in.

shivanshuraj1333 commented 2 years ago

Hi @cmacknz, thanks for responding. Do we have any decision yet?

The changes required for outputs are ready and tested locally. They include:

  1. an output codec for otlp, to be added under libbeat/outputs/codec
  2. an otlp exporter, to be added under libbeat/outputs

These two changes are all that is needed to add this feature.

Can you please elaborate on

"adding new outputs has a long tail of work in the form of ongoing user deployment support after they are introduced"

To resolve this concern:

"We need to be careful about when we introduce new outputs to ensure we can adequately support and test them in the various environments and configurations they will be deployed in"

How about merging the changes but not releasing them until they are thoroughly tested? We can mark this as an alpha feature whenever we're ready to release, and eventually promote it to beta.

This way, we can start progressing in the right direction, the changes will be well tested, and we'll make Filebeat compatible with OpenTelemetry Collector, increasing adoption of Filebeat among users of OpenTelemetry Collector.

Kindly let me know your thoughts.

cmacknz commented 2 years ago

We are still thinking about when and how best to support OpenTelemetry. I don't think there will be a decision with respect to Beats soon.

Besides the Beats' maintainers making the time to support and test an OpenTelemetry output to the extent we would like, we would ideally like to wait for the OpenTelemetry logging protocol to stabilize (currently experimental).

There is also an active proposal to adopt the Elastic Common Schema we use in our logs into OpenTelemetry (see https://github.com/open-telemetry/oteps/issues/197) and we would ideally wait for a decision on that proposal as well.

hartfordfive commented 2 years ago

There are multiple configuration options in Filebeat which are also marked as experimental (see the full config for v8.4). With that in mind, would it not be reasonable to add this feature and clearly state that it is experimental? I think it's safe to assume OpenTelemetry is here to stay, especially as it's been a CNCF project since May 2019. It's starting to gain popularity in the observability realm for many reasons. Having said that, I realize that Elastic has ECS to address consistency in log schemas, but that's not an option everyone necessarily wants to implement, as it's solution specific.

shivanshuraj1333 commented 1 year ago

@joshdover this is the original tracking issue with the idea to introduce an OTLP codec and exporter. We have since modified our code to do the serialization on the fly in the exporter itself, and we have a separate package for metrics.

This is in continuation with your discussion with Lalit. Should we raise a PR?

hartfordfive commented 1 year ago

Regarding the following comment in the linked issue, which asks whether a special receiver should be built in the opentelemetry-collector or whether Filebeat should implement an OTLP output option: it would be beneficial for Filebeat to have the capability to encode in the OTLP format. I'm wondering if Filebeat could also add a new processor (e.g. encode_otlp) which could allow the collected log events to be transformed to the OTLP log data model defined here. This could also potentially simplify applying a specific schema directly in Filebeat for some of the sub-fields (e.g. standardizing key names in the "Attributes" section). Considering OpenTelemetry has adopted ECS for schema standardization, I feel this approach would make sense.

As an example, someone could encode messages in the OTLP model and then publish them to Kafka. They could then consume them from the topic, process them with any tool and then index them into Elasticsearch.

secustor commented 1 year ago

"As an example, someone could encode messages in the OTLP model and then publish them to Kafka."

This is similar to the setup we use, and we are considering dropping Beats and Logstash because we want to standardize on OTLP.

joshdover commented 1 year ago

Hey everyone, I'd like to add a small update from the Elastic side here.

We've been discussing how best to add OpenTelemetry support to our ingestion components, and while there will be many different paths, we do think that an OTLP output directly in Beats is a good move for the ecosystem. We're open to pull requests for this and we've already been talking with some contributors about donating their own private implementations (e.g. @shivanshu1333 and Lalit). For now, we'd like to focus on support for logs, as it's a simpler translation than metrics, which require more metadata, such as the metric type (counter, gauge, etc.).

We prefer an OTLP output in Beats over building a lumberjack receiver in the OpenTelemetry collector, as there are several deficiencies in the lumberjack protocol (such as no mechanism for backpressure) that we don't believe set up Beats users for success with OTel. We'd rather put effort towards improving the OTel data model (through the ECS/SemConv merger) and protocols directly (such as https://github.com/open-telemetry/opentelemetry-proto/issues/470).

We also think enabling the large installation base of Beats to start sending data to OTel systems natively, without requiring OTel Collector in the middle, is a win for interoperability and making OpenTelemetry more widely available.

"I'm wondering if filebeat could also add a new processor (e.g.: encode_otlp) which could allow the log events collected to be transformed to the OTLP log data model"

@hartfordfive What's the use case for breaking this out as a separate processor instead of embedding this logic into an otlp output? Do you want to use the data model with other outputs? Would the OTel collector be able to serve this purpose?

As the ECS+SemConv merger makes progress and stabilizes, we'll likely have several options for translating from one schema to the new merged schema. For now, I lean towards keeping things simple until more of these details are figured out, and keeping any translation in the output itself. Definitely open to more discussion on this point.

sgarc57 commented 7 months ago

Any update on this? Is there some existing code available for testing?

cmacknz commented 7 months ago

There is nothing in the Beats repository to try yet but we are actively working on integrating with the OTel ecosystem on a few different fronts. In no specific order:

  1. Improving the OTel collector's Elasticsearch exporter.
  2. Improving the OTel collector's filelog receiver.
  3. Enabling Beats and Elastic Agent to output OTLP.
  4. Other work not specific to Beats (like APM and tracing).

We are still very early in this process and the initial work is focused on making changes in the upstream OTel repositories to make them easier to use with the Elastic stack. There will be more updates in the coming months.

shivanshuraj1333 commented 7 months ago

@cmacknz are there upstream issues in OpenTelemetry that you can add as a reference?

cmacknz commented 7 months ago

There isn't a single tracking issue yet. We are still early in the process and haven't finalized the plan for publishing and accepting OTLP data, given the data model is different from what exists in the Elastic stack today.

Vincehood commented 3 hours ago

Any update on this?

cmacknz commented 3 hours ago

There is still no official tracking issue, as we are still in what we'd consider the prototyping phase, though we are getting close to the end of it. Since this issue was created, a few things have happened:

  1. Elastic now has an OTel collector distribution, EDOT: https://www.elastic.co/observability-labs/blog/elastic-distributions-opentelemetry
  2. The EDOT collector is really Elastic Agent under the hood, with the collector running inside the elastic-agent process.
  3. We are in the process of making it possible to run Beat inputs inside the EDOT collector as receivers as part of the collector pipeline. https://github.com/elastic/elastic-agent/pull/5833 adds a "Filebeat receiver".

For now, the Beat inputs running as receivers (Beats receivers) are only being tested with the elasticsearch exporter and output documents in ECS format that look identical to what you'd get out of Beats. This is so they can be used with existing modules and integration assets without breaking everything because OTLP is a totally different data schema.

The follow-up work, once we have this working with the ECS schema, would be to let the Beats receivers output data in OTLP format, providing what this issue is asking for. However, it will happen in a different way than originally proposed here: the Beat inputs will run inside the Elastic OTel collector distribution instead of an OTLP output being added to Beats.

There will be some clearer communication once we have the whole end-to-end story here worked out and all of the prototyping has wrapped up.