RobQuistNL opened this issue 1 year ago
Another note: the logs state

If problem continues and you do not have any more disk space you can continue to manually trigger online GC at aggressive thresholds (< 0.01) with
lotus chain prune hot

This tells me that a lower value is more aggressive, but the other docs say a higher value is more aggressive.
Checklist
Latest release, the most recent RC (release candidate) for the upcoming release, or the dev branch (master), or have an issue updating to any of these.

Lotus component
Lotus Version
Repro Steps
HotStoreFullGCFrequency = 1
variable to do prunes as often as possible.

Describe the Bug
After some investigation, I figured out that:
HotStoreMaxSpaceThreshold
is actually defined as: "The maximum size the current hotstore + the potential new copy can occupy on disk"
Not "When HotStoreMaxSpaceTarget is set Moving GC will be triggered when total moving size exceeds HotstoreMaxSpaceTarget - HotstoreMaxSpaceThreshold" (as the docs state)
and
HotstoreMaxSpaceSafetyBuffer
should be defined as "the maximum size the new hotstore can be" instead of "Safety buffer to prevent moving GC from overflowing disk when HotStoreMaxSpaceTarget is set. Moving GC will not occur when total moving size exceeds HotstoreMaxSpaceTarget - HotstoreMaxSpaceSafetyBuffer"

Doc issues
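For reference, all of these knobs live in the splitstore section of the Lotus node config. A sketch with illustrative values (the byte values below are placeholders taken from the numbers discussed in this issue, not recommendations):

```toml
[Chainstore]
  EnableSplitstore = true

  [Chainstore.Splitstore]
    # Run a full (moving) GC on every prune, as in the repro above.
    HotStoreFullGCFrequency = 1

    # Per the findings above: in practice this bounds the size of the
    # current hotstore plus the potential new copy on disk.
    HotStoreMaxSpaceTarget = 650000000000

    HotStoreMaxSpaceThreshold = 150000000000
    HotstoreMaxSpaceSafetyBuffer = 50000000000
```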
The docs state these as defaults:
A node running without these values set will actually have these as defaults:
GC Hot CLI defaults
The docs state we should run
lotus chain prune hot --periodic --threshold 0.00000001
and increase the number. The CLI default is 0.01, not 0.00000001.

Apart from that, it's never explained what this threshold is. I now know it's some magic badgerBS value, but I still have no idea what I'm actually setting when I change it.
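For context, the threshold appears to map to badger's value-log GC discard ratio: a value-log file is only rewritten if at least that fraction of it is discardable, so a lower value makes more files eligible, i.e. more aggressive. A minimal self-contained sketch of that selection rule (the per-file garbage fractions are made-up illustrative values, not real Lotus data):

```go
package main

import "fmt"

// eligible returns the garbage fractions of value-log files that a
// badger-style GC would rewrite at the given discard ratio: a file
// qualifies only if at least `ratio` of its contents is discardable.
func eligible(garbageFractions []float64, ratio float64) []float64 {
	var out []float64
	for _, g := range garbageFractions {
		if g >= ratio {
			out = append(out, g)
		}
	}
	return out
}

func main() {
	files := []float64{0.001, 0.02, 0.30, 0.75} // per-file garbage fraction

	// Default CLI threshold 0.01: only files with >= 1% garbage qualify.
	fmt.Println(len(eligible(files, 0.01))) // 3

	// Aggressive threshold 0.00000001: effectively every file qualifies.
	fmt.Println(len(eligible(files, 0.00000001))) // 4
}
```

This matches the direction hinted at by the log line: a smaller threshold collects more files, so it is the more aggressive setting.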
Default pruned chain examples
When running a node with a pruned chain, and
HotStoreFullGCFrequency = 1
, the first time I'm seeing a GC run, we get those logs. Meaning that the defaults make no sense: a freshly pruned chain will always exceed 50000000000 (the new hotstore's expected size is 245681686326). It will also not trigger, because the expected new size plus the current size exceeds the target (245681686326 + 448854471748 >= 650000000000).
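Plugging the numbers from those logs into that condition shows why moving GC never fires with the default target of 650000000000 (a sketch; the condition is paraphrased from the log output above, not taken from Lotus source):

```go
package main

import "fmt"

func main() {
	const (
		maxSpaceTarget = 650_000_000_000 // default HotStoreMaxSpaceTarget
		currentHot     = 448_854_471_748 // current hotstore size from the logs
		expectedCopy   = 245_681_686_326 // expected size of the new hotstore copy
	)

	total := currentHot + expectedCopy
	fmt.Println(total)                   // 694536158074
	fmt.Println(total >= maxSpaceTarget) // true: moving GC stays blocked
}
```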
Apart from these settings, it looks like the prune logic doesn't take disk space into account. I like that we can set our own thresholds, but in my case I just want 2 things:
In my opinion, by using a clearer set of configuration params, we could achieve a nicer config setup;
OR
Then we should always know when we're coming close to a point of no return and have to GC.
Default config options could just trigger GC when the system notices we're about to run out of disk space.
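One hypothetical shape for that behaviour, sketched here with invented names and thresholds (this is not a Lotus API, just what the proposal could look like): trigger moving GC as soon as the free space remaining would no longer fit the expected new copy plus a safety buffer.

```go
package main

import "fmt"

// shouldMovingGC is a hypothetical policy: fire moving GC once free disk
// space drops to within the expected size of the new hotstore copy plus a
// safety buffer, instead of comparing against fixed byte targets.
func shouldMovingGC(freeBytes, expectedCopySize, safetyBuffer uint64) bool {
	return freeBytes < expectedCopySize+safetyBuffer
}

func main() {
	const gib = 1 << 30

	// Plenty of room left: no need to GC yet.
	fmt.Println(shouldMovingGC(800*gib, 230*gib, 50*gib)) // false

	// Copy + buffer would no longer fit: GC now, before the point of no return.
	fmt.Println(shouldMovingGC(250*gib, 230*gib, 50*gib)) // true
}
```

The free-bytes figure would come from a filesystem stat of the hotstore path; the point is that the trigger tracks actual disk headroom rather than hardcoded byte counts.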
Logging Information