But this is an incomplete snapshot then? We could basically fail to persist up to 2h of ingested data with this.
Why 2h? Chunks have a maximum of 120 samples.
https://github.com/prometheus/tsdb/blob/ba7f8025a9101663519d34020c283ed8f5ca79ff/head.go#L1695
Yes, there will be some missing data, but it guarantees that the data won't be corrupt.
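A minimal sketch of the idea being debated here (not the actual PR code), using a simplified stand-in for the chunk metadata: when persisting, the chunk that is still open for appends is skipped so its bytes are never read while they can still change underneath us.

package main

import "fmt"

// chunkMeta is a simplified stand-in for the tsdb chunk metadata; field names
// are illustrative only.
type chunkMeta struct {
	MinTime, MaxTime int64
	NumSamples       int
	Open             bool // still being appended to by the head
}

// persistable drops the chunk that is still open for appends, so that
// Bytes() is never read from a chunk that can change concurrently.
// At most the last ~120 samples per series are left out.
func persistable(chks []chunkMeta) []chunkMeta {
	out := make([]chunkMeta, 0, len(chks))
	for _, c := range chks {
		if c.Open {
			continue
		}
		out = append(out, c)
	}
	return out
}

func main() {
	chks := []chunkMeta{
		{MinTime: 0, MaxTime: 1785000, NumSamples: 120},
		{MinTime: 1800000, MaxTime: 1900000, NumSamples: 8, Open: true},
	}
	fmt.Println(len(persistable(chks))) // 1: only the completed chunk remains
}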
It's not safe to do this, you need a clean break in the data. If a user wanted this they can skip the head block.
Why not safe? This skips only a few of the last samples, not the entire head.
I looked at the code quite a bit, and otherwise chunk.Bytes() could return corrupt data at any time.
It's randomly skipping different amounts of data for different series.
I guess we can run some additional check to ensure all chunks are cut at the same time. So I understand the problem with different timings. Can you give an example of a query that will cause issues?
Do you have any other idea to solve this race?
Yes. Reencode those moving chunks: https://github.com/prometheus/tsdb/pull/639
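For reference, a rough sketch of what re-encoding an open chunk could look like. This is not the code from #639; it only assumes the tsdb chunkenc API of that time (Iterator()/Next()/At() and Appender()): samples up to a cutoff are copied into a freshly cut XOR chunk, which is then safe to persist.

package main

import (
	"fmt"
	"math"

	"github.com/prometheus/tsdb/chunkenc"
)

// reencode copies all samples up to and including maxt from an open chunk into
// a freshly cut XOR chunk. The copy is a consistent, completed chunk that is
// safe to persist, while the original stays open for further appends.
func reencode(open chunkenc.Chunk, maxt int64) (chunkenc.Chunk, error) {
	fresh := chunkenc.NewXORChunk()
	app, err := fresh.Appender()
	if err != nil {
		return nil, err
	}
	it := open.Iterator()
	for it.Next() {
		t, v := it.At()
		if t > maxt {
			break
		}
		app.Append(t, v)
	}
	return fresh, it.Err()
}

func main() {
	c := chunkenc.NewXORChunk()
	app, _ := c.Appender()
	for i := int64(0); i < 10; i++ {
		app.Append(i*15000, float64(i)) // one sample every 15s
	}
	safe, _ := reencode(c, math.MaxInt64)
	fmt.Println(safe.NumSamples()) // 10
}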
I thought about ignoring, but it's indeed quite bad if it can skip up to 2h of data.
Did you see my comment in your PR? I think it still has a race there.
Btw, at the most common interval of 15s and 120 samples per chunk, that is a max of 30min of data loss, I think (120 × 15s = 1800s = 30min).
Maybe we can combine both PRs, which means re-encoding all incomplete chunks (head chunks).
I am not saying my PR is perfect. Even CI is saying it's not, but my main focus is on remote read TBH, so joining forces would be nice. I can push my changes to some tsdb branch for easier collaboration @Krasi (:
I had a look at this and #639, and I find that re-encoding the last chunk safely and using it seems to be a better option than ignoring the last chunk. Reasons would be (1) what @brian-brazil said, (2) snapshot as much data as possible.
I have just merged both PRs
Ok, I might be biased, but I think having MaxTime == max is a bit better way of saying whether it's full. This is because it's quite natural and does not require a change in the interface. What do you think about closing this to avoid confusion and trying this approach on https://github.com/prometheus/tsdb/pull/639 ? Or we can start a new PR on a TSDB branch for better collab.
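To illustrate, a tiny sketch of the "MaxTime as the marker" idea, assuming math.MaxInt64 as the sentinel (the exact value is an assumption here, not taken from the PR): whether a chunk is full becomes a plain comparison and needs no interface change.

package main

import (
	"fmt"
	"math"
)

// chunkMeta is a simplified stand-in; an open head chunk keeps the sentinel
// MaxTime until it is cut, a completed chunk carries its real max timestamp.
type chunkMeta struct {
	MinTime, MaxTime int64
}

// full reports whether the chunk has been closed for appends.
func full(m chunkMeta) bool {
	return m.MaxTime != math.MaxInt64
}

func main() {
	open := chunkMeta{MinTime: 0, MaxTime: math.MaxInt64}
	closed := chunkMeta{MinTime: 0, MaxTime: 1800000}
	fmt.Println(full(open), full(closed)) // false true
}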
ready for another review
Another idea would be to signal in some other way that it is a head block (both open and closed ones) and treat those differently in terms of chunk maxTime being outside of the desired meta.MaxTime.
That's probably easiest at this stage.
could you explain a bit more about this idea?
No worries. Essentially we are now marking open chunks in a certain way and we rewrite only them. This part makes sense. But to fix the edge case we need something more. I propose to, let's say, also mark open blocks and rewrite those 100%.
Hm, still a bit unclear, but IIUC you mean: check if a chunk belongs to the head and rewrite it when it is open or outside meta.MaxTime? For all other chunks, just fail when outside meta.MaxTime?
Exactly (:
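To make the agreed rule concrete, here is a rough, self-contained sketch with hypothetical names and simplified types (the real logic lives in the compactor's populateBlock): head-block chunks that are open or reach past the block's maxt get rewritten, while the same situation in any other block is treated as an error.

package main

import (
	"errors"
	"fmt"
)

// chunk is a simplified stand-in for the chunk metadata seen during compaction.
type chunk struct {
	MaxTime int64
	Open    bool // still accepting appends in the head
}

// handleChunk applies the rule discussed above: reuse complete in-range chunks
// as-is, rewrite head chunks that are open or extend beyond maxt, and error
// out for any other block, since that would indicate corruption.
func handleChunk(isHeadBlock bool, c chunk, maxt int64) error {
	switch {
	case !c.Open && c.MaxTime <= maxt:
		return nil // complete and inside the block range: reuse as-is
	case isHeadBlock:
		fmt.Printf("re-encoding head chunk up to %d\n", maxt)
		return nil
	default:
		return errors.New("chunk outside of compacted block range")
	}
}

func main() {
	fmt.Println(handleChunk(true, chunk{MaxTime: 2000, Open: true}, 1800)) // rewritten, <nil>
	fmt.Println(handleChunk(false, chunk{MaxTime: 2000}, 1800))            // error
}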
Today is release day and this PR is ready with only one nit: the safeChunk casting to tell whether it's a head chunk or not. I would add a TODO and otherwise merge it and do the TSDB update on the Prometheus side. Any attempt at refactoring populate will probably take days to do and review, so I would fix this nit later.
Thoughts? @brancz @brian-brazil @codesome @krasi-georgiev
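For context on the nit, a small self-contained analog of the safeChunk casting (safeChunk below is only a stand-in for the unexported wrapper in head.go): a type assertion on the concrete chunk type tells the compactor it is looking at a head chunk. It works, but it couples the compactor to head internals, hence the suggested TODO to pass that information explicitly.

package main

import "fmt"

// chunk is a minimal interface; safeChunk stands in for the head-only wrapper
// type, so asserting to it identifies a head chunk.
type chunk interface{ Bytes() []byte }

type plainChunk struct{ b []byte }

func (c plainChunk) Bytes() []byte { return c.b }

// safeChunk wraps a chunk handed out by the head's chunk reader.
type safeChunk struct{ plainChunk }

func isHeadChunk(c chunk) bool {
	_, ok := c.(safeChunk)
	return ok
}

func main() {
	fmt.Println(isHeadChunk(safeChunk{}), isHeadChunk(plainChunk{})) // true false
}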
I ran a quick benchmark which I had ready, and here are the results. And this doesn't include the head in the compaction.
benchmark old ns/op new ns/op delta
BenchmarkCompaction/type=normal,blocks=2,series=1000000,samplesPerSeriesPerBlock=51-8 12470779450 12815518004 +2.76%
benchmark old allocs new allocs delta
BenchmarkCompaction/type=normal,blocks=2,series=1000000,samplesPerSeriesPerBlock=51-8 17818580 24843658 +39.43%
benchmark old bytes new bytes delta
BenchmarkCompaction/type=normal,blocks=2,series=1000000,samplesPerSeriesPerBlock=51-8 2112751344 2578539664 +22.05%
The increase in allocs (and also B/op) is alarming for a small change in populateBlock. And I confirmed that it is in populateBlock; I am investigating it.
Great check @codesome thanks, let's remove this warning then @krasi-georgiev
:+1:
Fixes: https://github.com/prometheus/tsdb/issues/518. Chunks that are still being appended to might return corrupted data when calling chunk.Bytes(); that is why incomplete chunks should never be persisted. This is relevant mainly when snapshotting with the head included. The current snapshot test is sufficient to ensure that incomplete chunks (those being appended to) are ignored.
Had this idea while reviewing https://github.com/prometheus/tsdb/pull/639