vectordotdev / vector

A high-performance observability data pipeline.
https://vector.dev
Mozilla Public License 2.0

Time formats in sinks should be derived from events #1743

Open anton-ryzhov opened 4 years ago

anton-ryzhov commented 4 years ago

It was unclear to me which timestamp is used to build the filename. It is not documented anywhere.

My first expectation was that it would be the timestamp of the last event, or maybe the timestamp of the first event.

But the current timestamp is the least useful option, in my view: the file metadata already records the creation time. It also means files are not named as expected when historical data is being processed.

It seems this idea is not new to you (https://github.com/timberio/vector/blob/6f290d3e55d78438100b1bd31747c6c4b1630184/src/sinks/aws_s3.rs#L318), but I couldn't find a ticket tracking it.
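
To illustrate the difference, here is a minimal sketch in Python. It is not Vector's actual code: the `render_prefix` helper, the event layout, and the `date=%Y-%m-%d/` template are all assumptions made up for this example. It only shows how the same strftime-style template produces a different name depending on whether it is rendered from the wall clock or from the event's own timestamp.

```python
from datetime import datetime, timezone


def render_prefix(template: str, when: datetime) -> str:
    """Expand strftime specifiers in a filename/key template (hypothetical helper)."""
    return when.strftime(template)


# A historical event that is being replayed long after it was produced.
# The dict layout is illustrative only.
event = {
    "message": "old log line",
    "timestamp": datetime(2019, 12, 31, 23, 59, 58, tzinfo=timezone.utc),
}

template = "date=%Y-%m-%d/"  # illustrative template, not taken from any config

# Behaviour described above: the name comes from the time of writing.
from_now = render_prefix(template, datetime.now(timezone.utc))

# Requested behaviour: the name comes from the event itself.
from_event = render_prefix(template, event["timestamp"])

print(from_now)    # today's date, whatever it happens to be
print(from_event)  # date=2019-12-31/
```

With historical data, only the second form puts the event under the date it actually belongs to. Which event in a batch should supply the timestamp (first, last, or each one individually) is left open here.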

binarylogic commented 4 years ago

Thanks! Definitely agree that basing this on the event's timestamp is much better.

jszwedko commented 3 years ago

Related: https://github.com/timberio/vector/issues/9079

splix commented 2 years ago

I'm having the same issue with the GCP Cloud Storage sink (and the File sink, which I use for testing). Is it possible to allow using the event timestamp for those sinks as well?

jszwedko commented 2 years ago

Thanks for the note, @splix! I agree, we should be using the event timestamp for deriving any timestamp-related configuration in sinks. I'll update this issue to reflect that.