If compaction fails (i.e., running out of space), the temporary files generated will be left on disk, and therefore they will not be removed and occupy the space. We should remove them if compaction fails.
~good_first_issue~ Initial implementation ideas
test_runner/regress/test_duplicate_layers.py::test_actually_duplicated_l1 creates a similar situation as what this issue is after fixing but instead of exiting pageserver after first image layer has been written out, we want to return an error and have all the layers which have not made it to layermap be deleted when exiting the scope. This is not currently implemented, and the way the pageserver::tenant::storage_layer::Layer is implemented right now is too deep in the DeltaLayerWriter. If the rename would happen later, we would have no problem.
The needed structure is not however trivial, because assuming we have new compacted layers ls which has more than 1 layer. After we rename 1 element to have it's final path, that final path might still need to be deleted in case any of the renames later fail.
This will most likely require changing pageserver::tenant::storage_layer::Layer::finish_creating or at least the outer procedure of "how to get final named paths become layers while creating compacted layers." My initial idea was to instead of having DeltaLayerWriter create the individual Layer values, have it create something like UnfinishedLayer which has the ability of being renamed from temporary file to actual path, but will retain the tempfile alike "delete on drop" until finally moved out to be an actual Layer which gets added to LayerMap.
There is quite the risk of becoming a badly reviewed external contributor PR because this is so open, so not yet adding that label.
If compaction fails (i.e., running out of space), the temporary files generated will be left on disk, and therefore they will not be removed and occupy the space. We should remove them if compaction fails.
~good_first_issue~ Initial implementation ideas
test_runner/regress/test_duplicate_layers.py::test_actually_duplicated_l1
creates a similar situation as what this issue is after fixing but instead of exiting pageserver after first image layer has been written out, we want to return an error and have all the layers which have not made it to layermap be deleted when exiting the scope. This is not currently implemented, and the way thepageserver::tenant::storage_layer::Layer
is implemented right now is too deep in theDeltaLayerWriter
. If the rename would happen later, we would have no problem.The needed structure is not however trivial, because assuming we have new compacted layers
ls
which has more than 1 layer. After we rename 1 element to have it's final path, that final path might still need to be deleted in case any of the renames later fail.This will most likely require changing
pageserver::tenant::storage_layer::Layer::finish_creating
or at least the outer procedure of "how to get final named paths become layers while creating compacted layers." My initial idea was to instead of havingDeltaLayerWriter
create the individualLayer
values, have it create something likeUnfinishedLayer
which has the ability of being renamed from temporary file to actual path, but will retain the tempfile alike "delete on drop" until finally moved out to be an actual Layer which gets added to LayerMap.There is quite the risk of becoming a badly reviewed external contributor PR because this is so open, so not yet adding that label.