open-telemetry / opentelemetry-collector-contrib

Contrib repository for the OpenTelemetry Collector
https://opentelemetry.io
Apache License 2.0

[receiver/Loki] Open discussion about the name and possible extension #21096

Closed gillg closed 1 year ago

gillg commented 1 year ago

Component(s)

receiver/loki

Is your feature request related to a problem? Please describe.

Hello,

@mar4uk I just discovered you rolled out a new Loki receiver 3 days ago and that's great!

I'm just pinging you because I'm wondering whether the name might be confusing. To me, a Loki receiver would be a receiver able to connect to the tail WebSocket API with a prebuilt query and read incoming logs from Loki.

The use cases could be deeper indexing/analysis of certain kinds of logs (e.g. fatal/errors) for grouping, large time-window searches, etc. It could also be a way to centralize this configuration in one central point instead of on each collector agent in potentially large infrastructure pipelines. Why not also for some reindexing / conversion to another system, etc.

Describe the solution you'd like

Your approach is currently an emulation of a Loki push endpoint, so like we have a Prometheus remote write receiver, I would name it something like Loki push receiver or something like that.
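
For context, and purely as a sketch of what such an emulated endpoint receives: a client pushing logs to the documented Loki push API `/loki/api/v1/push` looks roughly like the Go snippet below. The `localhost:3500` address is an illustrative assumption for where a collector might expose the emulated endpoint, not necessarily the receiver's actual default.

```go
// Sketch of a minimal Loki push client (not taken from the receiver's code):
// it POSTs the documented JSON payload of streams with [timestamp, line] pairs.
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"time"
)

type pushRequest struct {
	Streams []stream `json:"streams"`
}

type stream struct {
	Stream map[string]string `json:"stream"` // label set
	Values [][2]string       `json:"values"` // [unix-nanos as string, log line]
}

func main() {
	body := pushRequest{Streams: []stream{{
		Stream: map[string]string{"job": "demo"},
		Values: [][2]string{{fmt.Sprintf("%d", time.Now().UnixNano()), "hello from a push client"}},
	}}}
	buf, _ := json.Marshal(body)

	// Hypothetical address of a collector exposing the emulated push endpoint.
	resp, err := http.Post("http://localhost:3500/loki/api/v1/push", "application/json", bytes.NewReader(buf))
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	fmt.Println("status:", resp.Status)
}
```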

Else, why not combine the two modes of the receiver with some configs...?

I'm happy to discuss and hear your vision.

Describe alternatives you've considered

No response

Additional context

No response

github-actions[bot] commented 1 year ago

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

mar4uk commented 1 year ago

Else, why not combine the two modes of the receiver with some configs...?

I tried to implement a promtail receiver in the past that combined both modes, pull and push. But that led to bringing a lot of dependencies from Loki into the collector. Those dependencies had vulnerabilities and would have been difficult to maintain in the future. That's why it was decided not to go with both modes.

so like we have a Prometheus remote write receiver, I would name it something like Loki push receiver or something like that.

We don't have a prometheusremotewritereceiver; we have a prometheusreceiver (there is a prometheusremotewriteexporter). I think the name lokireceiver is aligned with the name prometheusreceiver, and it also leaves room to implement the pull mode in the future if we want to have another try.

gillg commented 1 year ago

My bad! I got mixed up between receivers and exporters. So yes, the name is consistent. I'm still divided about the concept, but if everyone is fine with the name I have no more arguments ^^

About the dependency on Loki libs: if we use the tail API, I don't really understand. By "emulating" the Loki API in push mode, you necessarily include the Loki libs in the OTel collector (same story for the exporter, by the way), no? To consume the API https://grafana.com/docs/loki/latest/api/#stream-log-messages we don't need to implement a specific Loki contract, so there is no dependency on Loki? It's a generic web service with some GET query parameters, and the query would just be a free string parameter in the OTel collector config. Am I wrong in these assumptions?
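
To illustrate the point, here is a minimal sketch (my own, not code from any existing component) of tailing Loki over the documented WebSocket endpoint `/loki/api/v1/tail` using only a generic WebSocket client; the Loki host, port, and query string below are illustrative assumptions:

```go
// Sketch: consume Loki's tail API without importing any Loki packages.
// The query is an opaque string passed through as a URL parameter, which
// is why no Loki-specific contract is needed on the collector side.
package main

import (
	"fmt"
	"net/url"

	"github.com/gorilla/websocket"
)

// tailResponse mirrors the documented shape of tail frames:
// streams of label sets with [timestamp, line] pairs.
type tailResponse struct {
	Streams []struct {
		Stream map[string]string `json:"stream"`
		Values [][2]string       `json:"values"`
	} `json:"streams"`
}

func main() {
	q := url.Values{}
	q.Set("query", `{job="myapp"} |= "FATAL"`) // free string from a hypothetical receiver config

	u := url.URL{Scheme: "ws", Host: "loki:3100", Path: "/loki/api/v1/tail", RawQuery: q.Encode()}

	conn, _, err := websocket.DefaultDialer.Dial(u.String(), nil)
	if err != nil {
		panic(err)
	}
	defer conn.Close()

	for {
		var frame tailResponse
		if err := conn.ReadJSON(&frame); err != nil {
			panic(err)
		}
		for _, s := range frame.Streams {
			for _, v := range s.Values {
				fmt.Printf("%v ts=%s line=%s\n", s.Stream, v[0], v[1])
			}
		}
	}
}
```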

mar4uk commented 1 year ago

Oh, I see, I didn't understand at first that you meant https://grafana.com/docs/loki/latest/api/#stream-log-messages. I thought that by "Else, why not combine the two modes of the receiver with some configs...?" you meant log file discovery https://grafana.com/docs/loki/latest/clients/promtail/#log-file-discovery.

That's an interesting case for using the tail API in the collector... I personally don't see much sense in using a collector for those scenarios:

The use cases could be deeper indexing/analysis of certain kinds of logs (e.g. fatal/errors) for grouping, large time-window searches, etc. It could also be a way to centralize this configuration in one central point instead of on each collector agent in potentially large infrastructure pipelines. Why not also for some reindexing / conversion to another system, etc.

If the user wants to address those scenarios, they probably don't need to convert Loki logs into OTLP format.

gillg commented 1 year ago

If the user wants to address those scenarios, they probably don't need to convert Loki logs into OTLP format.

I don't see a "need" either, but more of a convenience. A good scenario is dispatching software problems. It's the kind of feature that exists in "Sentry", for example. So why not ingest all the app logs into Loki, but dispatch some of them, based on a Loki query, to Sentry thanks to the OTel Collector.

You can definitely build your own middleware, but reusing the power of an OTel collector agent could be very interesting.

mar4uk commented 1 year ago

A good scenario is dispatching software problems

Loki has an Alerting and Recording Rules feature that allows you to query Loki logs and send alerts. This feature would help to dispatch software problems, right?

Sentry is usually used together with a logging system. So basically, if there is a need to dispatch something to Sentry, Sentry is probably already used to collect events. This means there is no need to introduce another tool to reproduce the behavior Sentry already provides.

gillg commented 1 year ago

I'm aware of Loki's rules capabilities, but I don't completely agree with the use case you propose. Recording rules don't let you group logs by a kind of custom signature, then dispatch them to the dev team for investigation and keep a history of the exact logs that you can search over large time windows. To me, the biggest trade-off of Loki is that you can't run queries over periods of multiple weeks or months. It's definitely acceptable to say that's not Loki's goal, and it already does the ingesting job very well, even with very high volumes.
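
(To make the "custom signature" idea concrete, here is an illustrative Go sketch, not tied to any existing component: variable tokens in a log line are normalized away and the remainder is hashed, so repeated occurrences of the same error group under one key. The normalization rules are made-up assumptions for the example.)

```go
// Sketch of grouping logs by a custom signature: strip the variable parts
// of a line, then hash what remains so identical error shapes share a key.
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"regexp"
)

var (
	hexIDs  = regexp.MustCompile(`0x[0-9a-fA-F]+`) // replace hex IDs first, before digits
	numbers = regexp.MustCompile(`\d+`)
)

// signature collapses variable tokens so two occurrences of the same
// error produce the same grouping key.
func signature(line string) string {
	norm := hexIDs.ReplaceAllString(line, "<hex>")
	norm = numbers.ReplaceAllString(norm, "<n>")
	sum := sha256.Sum256([]byte(norm))
	return hex.EncodeToString(sum[:])[:12]
}

func main() {
	a := signature("FATAL request 1234 failed at 0xdeadbeef")
	b := signature("FATAL request 5678 failed at 0xcafebabe")
	fmt.Println(a == b) // true: same signature, same group
}
```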

But being able to couple it to another system that is more collaboration- or task-management-oriented, through a bridge, would help a lot. In that case, instead of reinventing the wheel, a central OTel collector next to Loki could re-ingest specific logs on the fly into another system.

Agreed, that dispatching can be done at the log-producer level with the OTel collector, before sending the logs to Loki, but having a central collector do it after Loki ingestion simplifies the deployment/configuration perspective a lot on a large network with many heterogeneous systems.

mar4uk commented 1 year ago

Alright, thank you for the detailed answer!

I would park this proposal for now because the Loki receiver is new and we don't have much feedback about its usage. If after a while we see that users want this feature, we can consider implementing it.

github-actions[bot] commented 1 year ago

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

github-actions[bot] commented 1 year ago

This issue has been closed as inactive because it has been stale for 120 days with no activity.