Open dipack95 opened 5 years ago
just had a quick look:
the wal.Reader tracks the total bytes processed https://github.com/prometheus/tsdb/blob/b40cc43958a563aa06a73b2c8d9090e66109ad1b/wal/reader.go#L33.
After loading the wal with head.Init you will have the total wal size so far on disk.
after that in wal.Log can add to that total the additional added bytes
so total wal size would be the size when loading the wal + any additional added samples with wal.Log
@codesome @gouthamve what do you think?
Will that still work with compression?
I just double checked, and YES it will work as Reader.total tracks total bytes read from the disk and the decompression happens after that.
and for the wal.Log can track the additional bytes that go in after this: https://github.com/prometheus/tsdb/blob/b40cc43958a563aa06a73b2c8d9090e66109ad1b/wal/wal.go#L571-L581
Do you think it makes sense to update the WAL size after the pages have been flushed to disk as part of wal.flushPage(..)
?
Why than and not when doing the wal.Log?
There could be a possibility that a page cannot be flushed to disk, in which case the byte count that we add (I assume) immediately after optionally compressing the write buffer would differ from the actual written count. Similarly, the actual number of bytes written could differ from the size of the buffer actually passed to the wal.segment.Write(..)
call.
There could be a possibility that a page cannot be flushed to disk, in which case the byte count that we add (I assume) immediately after optionally compressing the write buffer would differ from the actual written count.
When a flush fails Prometheus will exit with an error so the next time it start the wal size will be populated correctly. This means that we shouldn't worry about failed flushed.
Similarly, the actual number of bytes written could differ from the size of the buffer actually passed to the wal.segment.Write(..) call.
yes you are right here. Best to use n
to populate the total written to disk.
https://github.com/prometheus/tsdb/blob/b40cc43958a563aa06a73b2c8d9090e66109ad1b/wal/wal.go#L471
When a flush fails Prometheus will exit with an error so the next time it start the wal size will be populated correctly. This means that we shouldn't worry about failed flushed.
Makes sense.
And then we could subtract the size of newly created block from the WAL size, at this line:
https://github.com/prometheus/tsdb/blob/7dd5e177aa89828b443e7a331610958d25dc8355/db.go#L459
aaah , wait checkpointing deletes wal files so need to look at the code to see how to handle that as well.
I was unable to find a direct reference to the size of the chunks written to the block on disk. I was thinking maybe we could modify this function to return the number of chunk bytes written, the total number of bytes written (including meta, tombstone, index), and error?
https://github.com/prometheus/tsdb/blob/7dd5e177aa89828b443e7a331610958d25dc8355/compact.go#L526
This way, we could pass it up the call chain.
compaction hasn't got much relation to WAL size. I think the wal checkpointing is the only blocker here.
The way that I understand it, the data is read from the head
(and in turn the WAL) using the same compact(..)
method, which we must take into account as it removing data from the WAL and writing it to persistent storage. Or have I misunderstood the structure?
I think there is always some duplicate data in the wal and last block. IIRC when loading the wal all samples after the maxt of the last block are ignored.
Okay. Based on this fact, maybe we could adopt the current approach of just reading the data/wal/
directory and using its size? As it currently seems like there a lot of forces in play when it comes to interacting with the WAL.
There would be a way to handle all cases, but it would def be more complicated. I am not strongly against it the DirSize(.. approach and would like to hear what @bwplotka @codesome @gouthamve think.
@krasi-georgiev Gentle nudge :)
Still waiting for some input from @bwplotka @codesome @gouthamve
Looks like the Windows and Linux Travis builds failed due to some permission issues.
restarted the tests.
Why not remove the duplicate DirSize and use this one for the other tests as well?
@krasi-georgiev I can do that once it passes review I suppose.
Considering there was no more activity on this PR, I've followed @krasi-georgiev's suggestion and converted both the WAL and the test code to use a singular DirSize(..)
impl. :)
Signed-off-by: Dipack P Panjabi dpanjabi@hudson-trading.com
Compute the size of the WAL and use it while calculating if exceeding the max retention size limit.
(PR #650 broke somehow, and so this is a new one with the same changes)