PrayagS opened this issue 7 months ago
How large are your index files?
@GiedriusS The size of index file for fresh blocks from the sidecar is ~2GB.
Once they're compacted into 2d blocks, the index size becomes ~20GB.
Yeah, that might be a problem. Maybe you could do a recursive listing of files and check whether you have any no-compaction marks inside your remote object storage?
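The check above can be sketched in a few lines. This is a minimal, hypothetical example: it assumes you already have a recursive key listing from your object store (via its CLI or SDK) and that keys look like `<block-ulid>/<marker-file>`; the marker filenames (`no-compact-mark.json`, `deletion-mark.json`) are the ones Thanos writes next to a block's `meta.json`.

```python
# Sketch: scan a recursive object-store listing for Thanos marker files.
# The listing itself would come from your object store's CLI or SDK;
# keys are assumed to look like "<block-ulid>/no-compact-mark.json".
MARKERS = ("no-compact-mark.json", "deletion-mark.json")

def find_marked_blocks(keys):
    """Return {marker filename: set of block ULIDs} for markers found."""
    marked = {m: set() for m in MARKERS}
    for key in keys:
        parts = key.rsplit("/", 1)
        if len(parts) == 2 and parts[1] in MARKERS:
            marked[parts[1]].add(parts[0])
    return marked

# Hypothetical listing with one no-compaction-marked block:
listing = [
    "01HXAMPLEBLOCKA/meta.json",
    "01HXAMPLEBLOCKA/no-compact-mark.json",
    "01HXAMPLEBLOCKB/meta.json",
]
print(find_marked_blocks(listing))
```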
@GiedriusS Thanks a lot for pointing that out. I had looked around for deletion markers but missed the no compaction markers.
I can see a good amount of my blocks have that marker because the compacted block's index size would exceed the 64GB limit (see https://github.com/thanos-io/thanos/issues/1424).
What's the fix here? Should I decrease the upload interval of the sidecar from 2h to something less?
@GiedriusS Bumping this issue.
I have recently set up new servers which are functionally sharded, so block sizes are much smaller; blocks with a 2-day range are ~500MB. I'm not seeing any no-compaction markers either.
Still, I'm seeing an irregularity: after creating a bunch of 2d blocks, the next block the compactor produces has a range of 9 days. Shouldn't it be a block with a range of 14 days?
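One plausible explanation for a shorter-than-14d block: the planner buckets blocks into range-aligned windows and compacts whatever contiguous blocks already fit in the next window, without waiting for the window to fill. The sketch below illustrates that grouping; the range ladder (2h, 8h, 2d, 14d) is the Thanos default, but the grouping logic here is a simplification of the real planner, not its actual code.

```python
# Sketch of TSDB-style planning: bucket blocks into windows aligned
# to the next compaction range (assumed ladder: 2h, 8h, 2d, 14d).
# Blocks are (min_time, max_time) tuples in hours; windows align to t=0.
RANGES_H = [2, 8, 48, 336]  # 2h, 8h, 2d, 14d

def group_for_range(blocks, range_h):
    """Bucket blocks into range-aligned windows they fit entirely inside."""
    windows = {}
    for mint, maxt in blocks:
        start = (mint // range_h) * range_h
        if maxt <= start + range_h:  # block fits fully inside the window
            windows.setdefault(start, []).append((mint, maxt))
    # Only windows holding more than one block are compaction candidates.
    return {w: bs for w, bs in windows.items() if len(bs) > 1}

# Five 2d blocks inside one 14d window: compacting them would yield a
# 10d block, shorter than 14d, because the window isn't full yet.
blocks = [(i * 48, (i + 1) * 48) for i in range(5)]
print(group_for_range(blocks, 336))
```

Under this model, a 9-day block is simply the result of compacting the 2d blocks that happened to exist in a 14d window at planning time.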
And I've also reached a state where downsampling to 1h resolution is not happening. Screenshot of the block state below:
The only warning logs I see are the following two messages, which seem unrelated to the issue:

```
requested to mark for deletion, but file already exists; this should not happen; investigate
empty chunks happened, skip series
```
The metrics for compaction backlog and downsampling backlog are both zero as of now, so it doesn't seem like the compactor is waiting for planned compactions to complete before starting downsampling.
Please let me know if any other data is needed from my end. TIA!
> And I've also reached a state where downsampling to 1h resolution is not happening.

Ignore this: it makes sense that downsampling won't start until the whole stream has had one complete compaction iteration, and that isn't the case here since this is just a subset of the stream.
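Beyond the complete-iteration requirement, downsampling also has per-block range thresholds. The sketch below encodes the commonly cited Thanos thresholds (raw blocks spanning more than ~40h become eligible for 5m resolution; 5m blocks spanning more than ~10d become eligible for 1h); treat the exact cutoffs as assumptions to verify against the docs for your version.

```python
# Sketch of Thanos-style downsampling eligibility by block range.
# Assumed thresholds: raw -> 5m at >40h of data, 5m -> 1h at >10d.
RAW, RES_5M, RES_1H = 0, 5 * 60_000, 60 * 60_000  # resolutions in ms
H = 3_600_000  # one hour in ms

def next_resolution(resolution_ms, range_ms):
    """Return the next downsampling resolution, or None if not eligible."""
    if resolution_ms == RAW and range_ms > 40 * H:
        return RES_5M
    if resolution_ms == RES_5M and range_ms > 10 * 24 * H:
        return RES_1H
    return None

print(next_resolution(RAW, 48 * H))         # 2d raw block: eligible for 5m
print(next_resolution(RES_5M, 9 * 24 * H))  # 9d 5m block: not yet 1h-eligible
```

This would also explain why a 9-day block hasn't reached 1h resolution yet under these assumed thresholds.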
I'm seeing irregularities in the compactor's algorithm. In particular, I see the following inconsistencies:
Version: v0.32.5
Configuration:
Resources assigned:
There are no errors in the log, only warnings which say:
I could find similar issues (https://github.com/thanos-io/thanos/issues/3711, https://github.com/thanos-io/thanos/issues/3721) but no recent activity there.
Attaching a screenshot of the current status of all the blocks: