grafana / loki

Like Prometheus, but for logs.
https://grafana.com/loki
GNU Affero General Public License v3.0

Batch export of stream over timerange #6840

Open Tristan971 opened 2 years ago

Tristan971 commented 2 years ago

**Is your feature request related to a problem? Please describe.**
In #409 a few of us asked for a practical way to export logs from Loki. Eventually logcli gained out-of-the-box batching support, which helps a lot in that regard.

However, it still doesn't really address cases where one needs a large dump of logs. In my case, I'm currently puzzled about the best way to get the logs of a single stream (to keep things simple) over a time range.

In principle that's quite easy, but here it represents around 150 million messages. So even with batches of 5,000 that's on the order of 30,000 requests; it's going to take just about forever and be rather expensive in the process (likely causing some pruning of memcache/FIFO caches along the way).

As a result, for now I'm trying to make do with only 3% of that (4.5 million messages), but it's not exactly ideal...

**Describe the solution you'd like**
I'd love a solution that lets me export at least one large stream of logs over a time range. Better would be all streams matching a set of labels, and ideal would be an arbitrary query.

Now, it wouldn't really make sense for a full query to be made faster just for this purpose (otherwise all queries would simply be made that fast), so I'm not expecting that. This is why I said a single stream would be enough: in that case "only" the chunk fetch and decompression are involved, which (I think? I could be wrong) should be significantly faster.

**Describe alternatives you've considered**
The most efficient way to do this today would be to write some kind of tool using Loki's internal libraries to find, download, and decompress chunks directly from the underlying storage; basically Loki without the query path.

And to be honest, it would be a very fair outcome of this issue for logcli to be able to do that.

**Additional context**
I don't think it's very relevant, but I'm trying to pull some (noisy) debug traces to diagnose a rather subtle bug, and I can't just share access to Loki with the people involved in the debugging process...

EraYaN commented 1 year ago

This would also be very useful if you ever need to produce dumps for vendor support (cloud vendors, etc.). They can't and won't interact with your Loki instance, of course, but they might need everything from the last 24 hours, for example. Doing that in batches of 5,000 is absolutely ridiculous; it takes a while, especially with larger log lines.

Essentially you want something like Postgres COPY semantics: you have Postgres write the result to a file on the server or to stdout, and it tracks progress in some special table/view.
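For readers unfamiliar with the analogy, this is roughly what the Postgres side looks like: `COPY ... TO STDOUT` streams the entire result set to the client in a single pass, with no client-side paging. The database, table, and column names below are placeholders, not anything from this issue.

```shell
# Hypothetical example of Postgres COPY semantics: dump the last 24h of a
# "logs" table (placeholder schema) to a local TSV file in one streamed pass.
psql mydb -c "COPY (SELECT ts, line FROM logs \
  WHERE ts > now() - interval '24 hours' ORDER BY ts) TO STDOUT" > logs.tsv
```

The point of the comparison is that the server drives the export end to end, rather than the client issuing thousands of paged queries.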

Although being able to run queries would still be very useful, even if only a reduced set is supported (maybe only label-based filtering?).

jeschkies commented 1 year ago

@Tristan971 when you write "stream", you mean a log stream, not an API response stream, right?

I'm all in favor of a kind of Apache Arrow Flight batch export that only accepts a time range and stream selector.

What kind of use cases do you have in mind?

Tristan971 commented 1 year ago

> when you write "stream", you mean a log stream, not an API response stream, right? I'm all in favor of a kind of Apache Arrow Flight batch export that only accepts a time range and stream selector.

That is exactly what I had in mind yes!

> What kind of use cases do you have in mind?

At the moment, the main use case for me is purely extracting large amounts of logs for external processing (in my case it was sharing a large volume of trace-level debugging logs, but I could see it being relevant for things like BI-type analysis, feeding some kind of nightly SIEM job, ...).

EraYaN commented 1 year ago

And I feel it should also work without a stream selector, in a "get everything" mode. Vendors get very, very pissy about "filtered logs", as do auditors.

jeschkies commented 1 year ago

@Tristan971 and @EraYaN, the new parallel flag for logcli in https://github.com/grafana/loki/pull/8518 might be interesting to you. It enables a batched download of all logs.
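For reference, the parallel download from that PR is driven from logcli roughly as sketched below. The flag names are taken from the #8518 work and the selector `{app="myapp"}` is a placeholder; verify the exact flags against `logcli query --help` on your version before relying on this.

```shell
# Sketch (assumed flags from PR #8518): split a 24h range into slices
# fetched by several workers, writing part files and merging them at the end.
logcli query '{app="myapp"}' \
  --from="2023-01-01T00:00:00Z" \
  --to="2023-01-02T00:00:00Z" \
  --parallel-duration=15m \
  --parallel-max-workers=4 \
  --part-path-prefix=/tmp/myapp-dump \
  --merge-parts
```

Each worker pulls its own time slice independently, which is what turns one long sequential paging loop into a sharded bulk download.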

Tristan971 commented 1 year ago

A sharding approach like that certainly sounds like a pretty big improvement already; not quite retrieving at wire/disk speed yet, but this was always more of a "1 hour rather than 2 days" feature request, so that's good enough for me 👍

Thanks!

gyoza commented 1 year ago

This is a quote from somebody in my company:

> Yeah, but bear in mind we are not an organization made entirely of shell-using people

Is there going to be some kind of UI export available in Grafana, or a way to link to the logs other than sharing a custom dashboard link?