vectordotdev / vector

A high-performance observability data pipeline.
https://vector.dev
Mozilla Public License 2.0
17.66k stars 1.56k forks

New source: "loki" #6873

Open afoninsky opened 3 years ago

afoninsky commented 3 years ago

For now, Vector has a "loki" sink, which is responsible for delivering logs to Loki. Loki can also serve live logs, so it's possible to create a source: https://grafana.com/docs/loki/latest/api/ -> /loki/api/v1/tail. In theory, this would make it possible to run queries with live tailing and build much more flexible processing pipelines.
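To make the tail endpoint concrete, here is a minimal sketch of how a client might build the WebSocket URL for /loki/api/v1/tail. The helper name `tail_url` and the `ws://localhost:3100` address are assumptions for illustration; `query`, `limit`, and `delay_for` are query parameters listed in the Loki API docs. Opening the WebSocket itself is left out, since that needs a third-party client library.

```python
from urllib.parse import urlencode


def tail_url(base, query, limit=None, delay_for=None):
    """Build a URL for Loki's live-tail endpoint (hypothetical helper).

    base      -- WebSocket base address, e.g. "ws://localhost:3100" (assumed)
    query     -- LogQL stream selector, e.g. '{app="web"}'
    limit     -- optional maximum number of entries to return
    delay_for -- optional number of seconds to delay delivery
    """
    params = {"query": query}
    if limit is not None:
        params["limit"] = limit
    if delay_for is not None:
        params["delay_for"] = delay_for
    # urlencode percent-escapes the braces, quotes, and '=' in the selector.
    return f"{base.rstrip('/')}/loki/api/v1/tail?{urlencode(params)}"


if __name__ == "__main__":
    print(tail_url("ws://localhost:3100", '{app="web"}', limit=10))
```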

marcbachmann commented 3 years ago

In case somebody wants to work on that, here's some example code showing how to merge multiple Loki streams into one array of logs: https://github.com/livingdocsIO/loki-log-export/blob/3488c019d5f0c2525cdc844e02c3a1a8a7d3e8d2/merge-logs.js#L3-L13
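For readers who don't want to follow the link, the idea can be sketched roughly as follows. This is not the linked JavaScript, just a minimal Python illustration. It assumes the Loki response shape where each stream carries a `stream` label map and a `values` list of `[timestamp_ns, line]` pairs. The function name `merge_streams` and the output field names are made up for this example.

```python
def merge_streams(result):
    """Flatten Loki stream results into one list sorted by timestamp.

    `result` is assumed to look like the `data.result` field of a Loki
    query response: a list of {"stream": {...}, "values": [[ts, line], ...]}.
    """
    entries = []
    for stream in result:
        labels = stream.get("stream", {})
        for value in stream.get("values", []):
            # Timestamps arrive as strings holding nanoseconds since the epoch.
            entries.append(
                {"timestamp_ns": int(value[0]), "labels": labels, "line": value[1]}
            )
    # list.sort is stable, so entries with equal timestamps keep stream order.
    entries.sort(key=lambda entry: entry["timestamp_ns"])
    return entries


if __name__ == "__main__":
    sample = [
        {"stream": {"app": "a"}, "values": [["3", "a3"], ["1", "a1"]]},
        {"stream": {"app": "b"}, "values": [["2", "b2"]]},
    ]
    for entry in merge_streams(sample):
        print(entry["timestamp_ns"], entry["labels"], entry["line"])
```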

joemiller commented 2 years ago

Vector could also implement a loki source listening on the /loki/api/v1/push API endpoint. Basically acting like the loki receiver in a typical promtail -> loki setup.

One thing that this would allow for is a path to incrementally move from a promtail/loki setup to a full vector pipeline. For example, in a promtail -> loki setup one could introduce vector aggregator into this pipeline easily: promtail -> vector-aggregator -> loki. Once the aggregator is in place it becomes straightforward to add additional outputs to your log pipeline: gcs, s3, bigquery, etc. Over time promtail should be replaced with vector agent. By introducing vector incrementally it gives the org time to convert their promtail config (filters, relabels, etc) and test safely before swapping promtail for vector agent.
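As a rough illustration of what such a receiver would have to do, here is a minimal sketch that turns a JSON push body into flat log events. It assumes the JSON form of the push payload (`streams`, each with `stream` labels and `values` of `[timestamp_ns, line]`). Promtail by default sends snappy-compressed protobuf instead, which this sketch does not handle. The function name `decode_push_body` and the event field names are hypothetical.

```python
import json


def decode_push_body(body):
    """Decode a JSON /loki/api/v1/push body into a flat list of events."""
    payload = json.loads(body)
    events = []
    for stream in payload.get("streams", []):
        labels = stream.get("stream", {})
        for entry in stream.get("values", []):
            # Only the first two elements are read: timestamp and log line.
            events.append(
                {
                    "timestamp_ns": int(entry[0]),
                    "labels": dict(labels),
                    "message": entry[1],
                }
            )
    return events


if __name__ == "__main__":
    body = (
        b'{"streams":[{"stream":{"job":"docker"},'
        b'"values":[["1700000000000000000","hello"]]}]}'
    )
    print(decode_push_body(body))
```

Each resulting event carries its own copy of the stream labels, so a downstream pipeline could route or relabel events individually before forwarding them to Loki or any other sink.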

kbudde commented 1 year ago

Good morning,

From what I can see there are two interesting use cases here.

(1) Described in this issue (initially): Loki is used as a source to retrieve the data as it is stored. Vector actively reads the data from the system. Most comparable to the prometheus_scrape source.

Use cases:

This was also implemented by @zamazan4ik in #15405. Unfortunately, it was closed.

Behavior when Vector is not running: A gap is created in the downstream systems. The data can be viewed in Loki.

(2) The second suggestion in this issue makes Vector part of the processing chain. The logs are pushed to Vector. Most comparable to the prometheus_remote_write source, which mimics a receiver endpoint.

Use cases:

Behavior when Vector is not running: Logs cannot be sent. Load on the clients to buffer the logs.

In my view, both approaches make sense, and something similar has already been implemented for Prometheus. That makes sense, as Loki is like Prometheus, but for logs.

Overview:

Grafana agent
or Promtail ---> Vector (2) ----> Loki <-- (pull)-- Vector(1) --> Sink

@jszwedko what is required to bring these improvements to Vector? I can create RFCs for both or create a second issue describing option 2. I'm not sure if I can help with the implementation (I have never done anything in Rust), but maybe paving the way will unblock motivated people like @zamazan4ik.

jszwedko commented 1 year ago

Hi @kbudde!

Just FYI that there is another contributor working on a websocket source over here: https://github.com/vectordotdev/vector/pull/17856

That seems likely to go in soon, after which I think a specialized loki pull source could be layered on top that pulls logs from Loki via its live tail API.

It also seems sensible to have a loki API source that exposes the same API as Loki for clients like promtail to push to (your option 2). I think we would be amenable to that too. It seems like it could be a fairly light wrapper around the http_server source.

bryanyork commented 11 months ago

I would love an integration for Promtail instances to send data to Vector.

GreyTeardrop commented 9 months ago

I would love for Vector to have a Loki API source. My use case: there's a tool (Unpoller) that supports sending log events to Loki only. Would be great to use Vector with it instead to be able to forward logs to other systems.

labmir commented 8 months ago

We have many Docker containers in production that are configured to stream to a Loki server. I want to put Vector in place of the Loki server so that afterwards I can route logs to both Elasticsearch (for our developers, for debugging) and Grafana/Loki (for our support engineers). So yes, +1 for Loki as a source.

zamazan4ik commented 8 months ago

@labmir if I understand your use case correctly, you need to use the already existing Loki sink.

labmir commented 8 months ago

@zamazan4ik Thank you, but I must not have explained it well. The reason we want Loki as a source is that our Docker daemon is configured to log to Loki, and changing the log driver for Docker requires restarting the Docker service and rebuilding Docker Compose, which means downtime I want to avoid in production. So we want to stop our Loki server, launch Vector on the same IP:PORT where Loki used to listen, and tell Vector to forward logs to both Loki (launched on a different IP:PORT) and Elastic.

pgassmann commented 3 months ago

there are two types of loki sources discussed here:

vector acting as loki server would be useful for us for two reasons:

LinTechSo commented 3 days ago

Hi, same requirement. Any updates?