kyma-project / telemetry-manager

Manager for the Kyma telemetry module
https://kyma-project.io/#/telemetry-manager/user/README
Apache License 2.0

Logging OTLP support #556

Open a-thaler opened 1 year ago

a-thaler commented 1 year ago

Motivation

The telemetry module was initially designed to be based fully on the OpenTelemetry project. Because the logs domain of the project was not yet stable and adoption was simply not there yet, the module was released with application logs based on Fluent Bit using the proprietary HTTP output. The SAP Cloud Logging backend did not support OTLP either.

These criteria have changed and adoption is slowly kicking in:

Client adoption brings the benefit of streamlining the setup by introducing a gateway, as is already done for traces and metrics. Clients can then push logs directly without the indirection via stdout, which opens up further opportunities such as easy collection of Kubernetes event logs.

Basing logs on OTLP will allow a streamlined approach to telemetry data, consistent with traces and metrics. All data can carry consistent attributes that follow the same semantics.

Also, the proprietary protocol used with Fluent Bit is closely aligned with the SAP Cloud Logging API and is not common for ingestion into other systems, so its usage is very limited.

Fluent Bit as a technology differs from the otel-collector framework and is less flexible. The telemetry-manager already shows many synergies between the traces and metrics domains; these could be applied to logging as well, simplifying the code base and its maintenance.

Goal and requirements

The goal is to base logs on the OTLP protocol, both for backend ingestion (to support more providers) and for client ingestion (to avoid the indirection via stdout where it is not desired). This leverages the otel-collector framework for a streamlined technology stack, while still supporting the collection of logs via stdout.
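To make the client-ingestion part of the goal more concrete, here is a minimal sketch of what a client would push instead of writing to stdout. It builds a single log record in the OTLP/JSON encoding. The function name, the service name, and the scope name are hypothetical and chosen only for illustration; nothing here reflects an actual telemetry-manager API.

```python
import json
import time


def build_otlp_logs_payload(service_name, message, severity_text="INFO"):
    """Build an OTLP/JSON logs export request holding a single log record."""
    return {
        "resourceLogs": [
            {
                # Resource attributes describe the emitting entity once,
                # instead of repeating the information on every record.
                "resource": {
                    "attributes": [
                        {"key": "service.name", "value": {"stringValue": service_name}}
                    ]
                },
                "scopeLogs": [
                    {
                        "scope": {"name": "example-logger"},  # hypothetical scope name
                        "logRecords": [
                            {
                                # 64-bit integers are encoded as decimal strings in OTLP/JSON.
                                "timeUnixNano": str(time.time_ns()),
                                "severityText": severity_text,
                                "body": {"stringValue": message},
                            }
                        ],
                    }
                ],
            }
        ]
    }


payload = build_otlp_logs_payload("checkout", "order placed")
print(json.dumps(payload, indent=2))
```

A client would send such a payload with an HTTP POST to the `/v1/logs` path of an OTLP/HTTP endpoint (port 4318 by default). The address of the log gateway is not defined in this issue and would be an assumption.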

The requirements can be split into three parts:

The new API must be available in parallel to the old API, which will be deprecated but will remain until it is no longer used. With https://github.com/kyma-project/kyma/issues/15932, Fluent Bit introduced support for OTLP, and a first iteration might introduce the OTLP output based on Fluent Bit. However, first investigations revealed that Fluent Bit neglects OTLP resource attributes; in particular, enrichment with relevant k8s metadata will not be possible. A direct jump to the otel-collector therefore seems necessary (which is desired anyway).
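The following sketch illustrates what enriching resource attributes with k8s metadata means in practice. It is not the otel-collector implementation: the function and the shape of the `pod_metadata` dictionary are hypothetical. Only the attribute keys (`k8s.namespace.name`, `k8s.pod.name`, `k8s.container.name`) follow the OpenTelemetry semantic conventions.

```python
def enrich_resource(resource_attributes, pod_metadata):
    """Return a copy of the resource attributes with Kubernetes metadata added.

    Keys that are already present win, so values set by the client
    are never overwritten.
    """
    k8s_attributes = {
        "k8s.namespace.name": pod_metadata.get("namespace"),
        "k8s.pod.name": pod_metadata.get("pod"),
        "k8s.container.name": pod_metadata.get("container"),
    }
    enriched = dict(resource_attributes)
    for key, value in k8s_attributes.items():
        # Skip metadata that is unknown or that the client already provided.
        if value is not None and key not in enriched:
            enriched[key] = value
    return enriched


# Hypothetical input: attributes sent by a client plus metadata looked up for its pod.
client_attributes = {"service.name": "checkout"}
pod_metadata = {"namespace": "shop", "pod": "checkout-5d9f", "container": "app"}
print(enrich_resource(client_attributes, pod_metadata))
```

If an exporter drops the resource level altogether, as the investigation above reports, there is no place to attach this metadata. In an otel-collector pipeline, enrichment of this kind is commonly delegated to a processor such as `k8sattributes`.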

Targeted Architecture

[Figure: logs drawio diagram]

Actions

a-thaler commented 1 year ago

Testing the OTLP output revealed serious problems. Some were fixed with 2.0.9, but problems remain.

kyma-bot commented 1 year ago

This issue or PR has been automatically marked as stale due to the lack of recent activity. Thank you for your contributions.

/lifecycle stale