ydb-platform / nbs

Network Block & File Store
Apache License 2.0

[Filestore] Compaction/Cleanup/FlushBytes optimization #95

Closed: qkrorlqr closed this issue 6 months ago

qkrorlqr commented 9 months ago

Use blob patching in compaction

won't use patching - it's too risky; we're still finding bugs in it 2.5 years after it was first introduced

qkrorlqr commented 9 months ago

Found a problem: deletion markers are not deleted from the FreshBytes table upon FlushBytes/TrimBytes.

  • marker added upon range deletion: https://github.com/ydb-platform/nbs/blob/a57aabbace4a8993881e5b300598f43ee3873877/cloud/filestore/libs/storage/tablet/tablet_state_data.cpp#L190
  • marker written to db: https://github.com/ydb-platform/nbs/blob/a57aabbace4a8993881e5b300598f43ee3873877/cloud/filestore/libs/storage/tablet/tablet_state_data.cpp#L323
  • calling DeleteBytes: https://github.com/ydb-platform/nbs/blob/a57aabbace4a8993881e5b300598f43ee3873877/cloud/filestore/libs/storage/tablet/model/fresh_bytes.cpp#L122
  • DeleteBytes only deletes data bytes: https://github.com/ydb-platform/nbs/blob/a57aabbace4a8993881e5b300598f43ee3873877/cloud/filestore/libs/storage/tablet/model/fresh_bytes.cpp#L26

=> the marker can't be found anywhere and thus cannot be deleted from the db
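A minimal model of the bug described above (all names here are illustrative, not the actual NBS types): the fresh-bytes index tracks both written byte ranges and deletion markers, but the delete path only erases written data, so a deletion marker is never found again and its db row is never removed.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical sketch of the fresh-bytes index. An entry is either written
// data or a deletion marker for a range delete.
struct TEntry {
    uint64_t Offset = 0;
    uint64_t Length = 0;
    bool IsDeletionMarker = false;
};

struct TFreshBytesModel {
    std::vector<TEntry> Entries;

    void AddBytes(uint64_t offset, uint64_t len) {
        Entries.push_back({offset, len, false});
    }

    void AddDeletionMarker(uint64_t offset, uint64_t len) {
        Entries.push_back({offset, len, true});
    }

    // Mirrors the observed behaviour: only data entries are erased, so a
    // deletion marker covering the same range survives and can never be
    // matched against the FreshBytes table for removal.
    void DeleteBytes(uint64_t offset, uint64_t len) {
        Entries.erase(
            std::remove_if(
                Entries.begin(), Entries.end(),
                [&](const TEntry& e) {
                    return !e.IsDeletionMarker
                        && e.Offset == offset
                        && e.Length == len;
                }),
            Entries.end());
    }

    size_t RemainingMarkers() const {
        return static_cast<size_t>(std::count_if(
            Entries.begin(), Entries.end(),
            [](const TEntry& e) { return e.IsDeletionMarker; }));
    }
};
```

After a write, a range delete, and a flush-driven DeleteBytes over the same range, the marker is still present in the model - which is exactly why the corresponding db row leaks.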

qkrorlqr commented 8 months ago

Found a performance issue: Compaction copies blocks to its dst blobs which are not actually needed but whose MaxCommitId is still greater than MinCommitId (CommitIds are not rebased yet). A second compaction run is needed to get rid of this garbage.

UPD: fixed it here https://github.com/ydb-platform/nbs/pull/729
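The visibility check behind this issue can be sketched as follows (a simplified model, not the actual NBS code - see the linked PR for the real fix): a block is worth copying at compaction time only if it is still live at the compaction's commit id, i.e. written before it and not deleted at or before it.

```cpp
#include <cstdint>

// Sentinel meaning "never deleted" (illustrative constant).
constexpr uint64_t InvalidCommitId = UINT64_MAX;

struct TBlock {
    uint64_t MinCommitId = 0;               // commit that wrote the block
    uint64_t MaxCommitId = InvalidCommitId; // commit that deleted it, if any
};

// A block is live at `commitId` iff it was written at or before commitId
// and not yet deleted at commitId. Copying blocks that fail this check
// (MaxCommitId set, but still > MinCommitId because CommitIds were not
// rebased) just moves garbage into the dst blob.
bool IsLiveAt(const TBlock& b, uint64_t commitId) {
    return b.MinCommitId <= commitId && b.MaxCommitId > commitId;
}
```

A block deleted at commit 50 is not live at commit 100 and should be skipped by compaction rather than copied and re-collected on a second run.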

qkrorlqr commented 6 months ago

Found another problem: CompactionThreshold is set in terms of blob count, while max blob size is set in bytes (and the default, 4MiB, is reasonable). This leads to nonstop recompactions for filesystems with large block sizes.

E.g. with BlockSize=128KiB, MaxBlobSize=4MiB, CompactionThreshold=20: a "full" compaction range (64 blocks x 16 nodes == 1024 blocks) holds 128MiB of data => 32 blobs, which is greater than 20 and leads us to endless compactions.

A possible solution is adjusting this threshold based on the BlockSize / DefaultBlockSize ratio (DefaultBlockSize == 4KiB => we will have min 1 blob per range if there is some data in this range - that's our baseline).
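The proposed adjustment can be sketched like this (function and constant names are assumptions for illustration, not the actual NBS identifiers): scale the blob-count threshold by BlockSize / DefaultBlockSize, so a "full" range stays comparable to the threshold regardless of the filesystem block size.

```cpp
#include <algorithm>
#include <cstdint>

// 4KiB baseline from the comment above: at DefaultBlockSize a range with
// any data produces at least 1 blob, which is what the original threshold
// was tuned against.
constexpr uint64_t DefaultBlockSize = 4096;

// Hypothetical helper: scale the configured blob-count threshold by the
// BlockSize / DefaultBlockSize ratio (clamped to at least 1).
uint32_t AdjustedCompactionThreshold(uint32_t threshold, uint64_t blockSize) {
    const uint64_t ratio = std::max<uint64_t>(1, blockSize / DefaultBlockSize);
    return static_cast<uint32_t>(threshold * ratio);
}
```

With the numbers from the example (BlockSize=128KiB, CompactionThreshold=20), the ratio is 32, so the adjusted threshold becomes 640; the 32 blobs of a full 128MiB range no longer exceed it, and the endless recompaction loop goes away.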

UPD: fixed

qkrorlqr commented 6 months ago

We also need some metrics showing the number of compactions per second, split by reason: launches per second due to high blob count per range / due to high overall blob count / due to high garbage percentage.

Outputting the current garbage percentage and the average blob count per range to the tablet monpage would also be nice.
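A per-reason counter set for this could look roughly like the sketch below (the enum values mirror the three reasons listed above; all names are illustrative, not the actual NBS metrics API):

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical compaction-launch reasons, one per trigger described above.
enum class ECompactionReason : size_t {
    BlobCountPerRange,   // too many blobs in a single compaction range
    TotalBlobCount,      // too many blobs in the whole filesystem
    GarbagePercentage,   // garbage share above the configured threshold
    MAX,
};

// One monotonically increasing launch counter per reason; a rate-per-second
// metric can be derived from these by the monitoring layer.
struct TCompactionMetrics {
    std::array<uint64_t, static_cast<size_t>(ECompactionReason::MAX)> Launches{};

    void OnCompactionStarted(ECompactionReason reason) {
        ++Launches[static_cast<size_t>(reason)];
    }

    uint64_t Get(ECompactionReason reason) const {
        return Launches[static_cast<size_t>(reason)];
    }
};
```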

qkrorlqr commented 6 months ago

From the description of this issue:

  • Implement lazy compaction map loading

Done

  • Implement garbage-based compaction

Done

  • Use blob patching in compaction

Postponed

  • Think about some ways to optimize Cleanup and FlushBytes and the FreshBytes layer

TODO

  • Global triggers for Compaction and Cleanup - triggers for total deletion marker count per fs, total blob count per fs, etc.

Done

  • Add metrics for background ops, display more background ops info on the tablet monpage

Done

Also, some bugs were found during the work on this issue; some of them were fixed.


qkrorlqr commented 6 months ago

Current TODO list for this issue:

  • Missing deletion marker cleanup in FlushBytes (bug)

DONE

  • Cleanup/FlushBytes/FreshBytes layer optimization - it's actually a candidate for a separate issue

Will increase Compaction/Cleanup throughput here: https://github.com/ydb-platform/nbs/issues/1129; will work on the fresh bytes layer sometime later.

  • Optimize CompactionMap load times

https://github.com/ydb-platform/nbs/issues/1128