Open jbowens opened 1 year ago
Could we do something simple where, if the current score of L0 is < 1.0 but the total bytes L0 will delete exceed 10% of the LSM size, we bump the score up to 1.0? For a small LSM with Lbase = L6, this should have the desired aggressive effect. For very large LSMs this will be extremely rare, but 10% space amp isn't bad.
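The rule proposed above can be sketched as follows. This is a minimal illustration, not Pebble's actual code; the identifiers (`l0DeleteBytesEstimate`, `lsmTotalBytes`) are assumed names for the quantities discussed.

```go
package main

import "fmt"

// adjustedL0Score sketches the proposed rule. The parameter names
// (l0DeleteBytesEstimate, lsmTotalBytes) are illustrative, not
// Pebble's actual identifiers.
func adjustedL0Score(score float64, l0DeleteBytesEstimate, lsmTotalBytes uint64) float64 {
	// If L0's score is below the compaction threshold but its
	// tombstones would reclaim more than 10% of the LSM's bytes,
	// force the score up to the threshold.
	if score < 1.0 && l0DeleteBytesEstimate > lsmTotalBytes/10 {
		return 1.0
	}
	return score
}

func main() {
	// 20 MB of estimated deletes against a 100 MB LSM: the score
	// 0.3 is bumped to 1.0, triggering an L0 => Lbase compaction.
	fmt.Println(adjustedL0Score(0.3, 20<<20, 100<<20))
}
```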
Yeah, I think that's reasonable.
Is there a downside to incorporating (compensated) level size into L0 scoring more generally, eg, maybe as a multiple of LBaseMaxBytes, and taking the MAX of the computed scores?
Do you mean something like score = L0_compensated_size / (k * LbaseMaxBytes)? That should be ok too, and more principled. I was being paranoid that there may be some existing workloads with a sub-level count of 1 and > 64MB in L0 that manage to delay L0 => Lbase compactions (and hence reduce write amp), so I was trying to scope the re-scoring logic narrowly to the garbage case.
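A minimal sketch of that alternative: derive a size-based score from L0's compensated size and take the max with the existing sub-level-based score. The multiplier k and all identifiers here are assumptions for illustration, not Pebble's actual code.

```go
package main

import "fmt"

// l0Score combines the existing sub-level-based score with a
// size-based score computed from L0's compensated size, taking the
// MAX of the two. k is an illustrative multiple of LBaseMaxBytes.
func l0Score(subLevelScore float64, l0CompensatedSize, lbaseMaxBytes uint64) float64 {
	const k = 2 // assumed multiplier; tuning it is the open question
	sizeScore := float64(l0CompensatedSize) / float64(k*lbaseMaxBytes)
	if sizeScore > subLevelScore {
		return sizeScore
	}
	return subLevelScore
}

func main() {
	// 256 MB compensated L0 against LBaseMaxBytes = 64 MB: the
	// size-based score (2.0) dominates a sub-level score of 0.5.
	fmt.Println(l0Score(0.5, 256<<20, 64<<20))
}
```

Taking the MAX preserves the current behavior whenever the sub-level score already dominates, so the size term only matters in the deletion-heavy case the thread is worried about.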
Special level scoring logic applies to L0. None of this logic accounts for potentially high compensated sizes (PointDeletionBytesEstimate and RangeDeletionBytesEstimate). This can result in tombstones that would delete an unbounded volume of data lower in the LSM accumulating in L0, preventing space reclamation.

TODO(jackson): Add a detailed outline of L0's existing scoring logic.
Jira issue: PEBBLE-49
Epic: CRDB-40361