Closed by qkrorlqr 6 months ago
Use blob patching in compaction
won't use patching - it's too risky, we still keep finding bugs in it 2.5 years after it was first introduced
found a problem: deletion markers are not deleted from the FreshBytes table upon FlushBytes/TrimBytes
marker added upon range deletion: https://github.com/ydb-platform/nbs/blob/a57aabbace4a8993881e5b300598f43ee3873877/cloud/filestore/libs/storage/tablet/tablet_state_data.cpp#L190
marker written to db: https://github.com/ydb-platform/nbs/blob/a57aabbace4a8993881e5b300598f43ee3873877/cloud/filestore/libs/storage/tablet/tablet_state_data.cpp#L323
calling DeleteBytes: https://github.com/ydb-platform/nbs/blob/a57aabbace4a8993881e5b300598f43ee3873877/cloud/filestore/libs/storage/tablet/model/fresh_bytes.cpp#L122
DeleteBytes only deletes stuff: https://github.com/ydb-platform/nbs/blob/a57aabbace4a8993881e5b300598f43ee3873877/cloud/filestore/libs/storage/tablet/model/fresh_bytes.cpp#L26
=> the marker can't be found anywhere and thus cannot be deleted from the db
found a performance issue: Compaction copies blocks which are no longer needed, but whose MaxCommitId is still greater than MinCommitId (CommitIds are not rebased yet), to its dst blobs. A second compaction run is needed to get rid of this garbage.
UPD: fixed it here https://github.com/ydb-platform/nbs/pull/729
found another problem: CompactionThreshold is set in terms of blob count, while max blob size is set in bytes (and the default is reasonable - 4MiB), which leads to nonstop recompactions for filesystems with large block sizes
e.g. BlockSize=128KiB, MaxBlobSize=4MiB, CompactionThreshold=20: if we have a "full" compaction range (64 blocks x 16 nodes == 1024 blocks), there will be 128MiB of data in it => 32 blobs, which is greater than 20, which leads us to endless compactions
a possible solution is adjusting this threshold based on the BlockSize / DefaultBlockSize ratio (DefaultBlockSize == 4KiB => we will have min 1 blob per range if there is some data in this range - it's our baseline)
UPD: fixed
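A rough sketch of the arithmetic above and of one way to scale the threshold by the BlockSize / DefaultBlockSize ratio. The function names are hypothetical, and multiplying the threshold by the ratio is an assumption made for illustration; the actual fix may use a different formula.

```python
KiB = 1024
MiB = 1024 * KiB

# Taken from the example above: 64 blocks x 16 nodes per compaction range.
BLOCKS_PER_RANGE = 64 * 16
DEFAULT_BLOCK_SIZE = 4 * KiB


def min_blobs_for_full_range(block_size, max_blob_size):
    """Smallest blob count that can hold a completely filled range."""
    range_bytes = BLOCKS_PER_RANGE * block_size
    # Ceiling division: a partially filled last blob still counts as a blob.
    return -(-range_bytes // max_blob_size)


def scaled_threshold(base_threshold, block_size):
    """Hypothetical adjustment: scale the threshold by BlockSize / DefaultBlockSize."""
    ratio = max(1, block_size // DEFAULT_BLOCK_SIZE)
    return base_threshold * ratio


if __name__ == "__main__":
    blobs = min_blobs_for_full_range(128 * KiB, 4 * MiB)
    print("min blobs in a full range:", blobs)
    print("raw threshold exceeded:", blobs > 20)
    print("scaled threshold exceeded:", blobs > scaled_threshold(20, 128 * KiB))
```

With the default 4KiB block size a full range fits into a single 4MiB blob, so the ratio is 1 and the threshold is unchanged. With 128KiB blocks the ratio is 32, so a freshly compacted full range no longer exceeds the threshold on its own.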
we also need some metrics showing the number of compactions per second split by the reasons - launches per second due to high blob count per range / due to high overall blob count / due to high garbage percentage
and outputting current garbage percentage and average blob count per range to the tablet monpage would also be nice
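A minimal sketch of what such metrics could look like: a launch counter split by reason, plus the two monpage values. All names here are hypothetical, and the garbage percentage formula (unused share of stored blocks) is an assumption, not necessarily the definition the tablet uses.

```python
from collections import Counter
from enum import Enum


class CompactionReason(Enum):
    # The three launch reasons listed above.
    RANGE_BLOB_COUNT = "high blob count per range"
    TOTAL_BLOB_COUNT = "high overall blob count"
    GARBAGE_PERCENTAGE = "high garbage percentage"


class CompactionStats:
    """Counts compaction launches per reason."""

    def __init__(self):
        self.launches = Counter()

    def on_launch(self, reason):
        self.launches[reason] += 1

    def rates(self, elapsed_seconds):
        """Launches per second for every reason, including reasons with zero launches."""
        return {r: self.launches[r] / elapsed_seconds for r in CompactionReason}


def garbage_percentage(used_blocks, stored_blocks):
    """Assumed definition: share of stored blocks that are no longer used."""
    if stored_blocks == 0:
        return 0.0
    return 100.0 * (stored_blocks - used_blocks) / stored_blocks


def average_blobs_per_range(blob_counts):
    """Mean blob count over the ranges that are present."""
    if not blob_counts:
        return 0.0
    return sum(blob_counts) / len(blob_counts)
```

Splitting the counter by reason makes it possible to tell a misconfigured per-range threshold (like the one described above) apart from compactions caused by real garbage.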
From the description of this issue:
- Implement lazy compaction map loading
Done
- Implement garbage-based compaction
Done
- Use blob patching in compaction
Postponed
- Think about some ways to optimize Cleanup and FlushBytes and the FreshBytes layer
TODO
- Global triggers for Compaction and Cleanup - triggers for total deletion marker count per fs, total blob count per fs, etc.
Done
- Add metrics for background ops, display more background ops info on the tablet monpage
Done
Also some bugs were found during the work on this issue. Some of them were fixed.
Current TODO list for this issue:
- Missing deletion marker cleanup in FlushBytes (bug)
DONE
- Cleanup/FlushBytes/FreshBytes layer optimization - it's actually a candidate for a separate issue
will increase Compaction/Cleanup throughput here: https://github.com/ydb-platform/nbs/issues/1129, will work on the fresh bytes layer sometime later
- Optimize CompactionMap load times