Open Li0k opened 1 year ago
How will you implement write stall? By adding lags?
We have already implemented WriteLimit, and we can achieve write stall by blocking flush, this issue wants to further discuss its limitations.
@Li0k Any updates?
No, I'm considering suspending this PR.
I think we can bring this issue back and see how we can optimize the write limit conditions. We have seen several cases where writes are not stalled even though there are many overlapping sub-levels. This can happen when to-base compaction and tier compaction are stuck while intra-L0 compaction can still proceed.
Is your feature request related to a problem? Please describe.
In Hummock, an excessive number of SSTs degrades read performance and compactor performance, so we introduced `WriteLimiter` to implement a write stall for excessive writes. The system uses `wait_permission` to determine whether the flush condition is satisfied at each flush. When the write limit is in effect, it causes a write stall in Hummock and affects the upstream operators through backpressure, thus relieving the pressure of SST pileup in Hummock.

In the current implementation, we only use `level0_stop_write_threshold_sub_level_number` as the write limit condition, and the default value is 1000, which is quite lenient. In some recent scenarios, we have found that this limit takes effect with a lag and cannot limit writes in time, leading to further deterioration of the LSM state.

Describe the solution you'd like
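As a baseline for the discussion, the current single-threshold check described above might look roughly like the sketch below. This is not the actual Hummock code: `CompactionConfigSketch`, `is_write_limited`, `WriteLimiterSketch`, and `has_permission` are hypothetical names, the real `wait_permission` waits for permission rather than returning a boolean, and treating "count reaches the threshold" as the trigger is an assumption. Only the field name `level0_stop_write_threshold_sub_level_number` and the default of 1000 come from this issue.

```rust
/// Hypothetical stand-in for the relevant part of the compaction config.
pub struct CompactionConfigSketch {
    /// Field name and default (1000) are taken from the issue text.
    pub level0_stop_write_threshold_sub_level_number: u64,
}

impl Default for CompactionConfigSketch {
    fn default() -> Self {
        Self {
            level0_stop_write_threshold_sub_level_number: 1000,
        }
    }
}

/// Returns true when writes should be stalled.
/// Assumption: the limit triggers once the L0 sub-level count
/// reaches the configured threshold.
pub fn is_write_limited(l0_sub_level_count: u64, config: &CompactionConfigSketch) -> bool {
    l0_sub_level_count >= config.level0_stop_write_threshold_sub_level_number
}

/// Simplified, non-blocking limiter (hypothetical). It only records
/// whether the limit is currently in effect; a real limiter would
/// make the flush wait instead of returning a flag.
#[derive(Default)]
pub struct WriteLimiterSketch {
    limited: bool,
}

impl WriteLimiterSketch {
    /// Re-evaluate the limit from the latest L0 sub-level count.
    pub fn update(&mut self, l0_sub_level_count: u64, config: &CompactionConfigSketch) {
        self.limited = is_write_limited(l0_sub_level_count, config);
    }

    /// Non-blocking stand-in for `wait_permission`: true if a flush may proceed.
    pub fn has_permission(&self) -> bool {
        !self.limited
    }
}
```

With the default threshold, a tree with 999 L0 sub-levels would still be allowed to flush under this sketch, which illustrates why a single lenient condition can react too late.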
The purpose of introducing WriteLimit is to prevent the shape of the LSM tree from becoming abnormal, which in turn reduces read performance and compactor efficiency. For these reasons, we can introduce more constraints:
Describe alternatives you've considered
No response
Additional context
No response