Open prasad-reddyk opened 4 years ago
Currently, BookKeeper uses a lazy garbage collection mechanism to reclaim disk space after ledgers/segments are deleted, so there is a gap between when ledgers are deleted and when the disk space is reclaimed. Please provision a large disk capacity if you are sending a fair amount of traffic to the cluster, so that the bookie garbage collector has enough time to reclaim the disk space.
We are experiencing a similar issue, with huge disk space usage even when we explicitly delete all the topics. Normally, we limit both the backlog quota and the retention period. However, it does not seem to have much impact. Is there any way to trigger garbage collection or any other way to reclaim disk space or set a bound on it? Thanks
@orbang Can you describe more about what you are experiencing?
@prasad-reddyk @orbang As sijie mentioned, how about setting GC to run more frequently in bookkeeper.conf? The default intervals for minor and major compaction are 1 hour and 1 day:
# Threshold of minor compaction
# For those entry log files whose remaining size percentage reaches below
# this threshold will be compacted in a minor compaction.
# If it is set to less than zero, the minor compaction is disabled.
minorCompactionThreshold=0.2
# Interval to run minor compaction, in seconds
# If it is set to less than zero, the minor compaction is disabled.
minorCompactionInterval=3600
# Threshold of major compaction
# For those entry log files whose remaining size percentage reaches below
# this threshold will be compacted in a major compaction.
# Those entry log files whose remaining size percentage is still
# higher than the threshold will never be compacted.
# If it is set to less than zero, the major compaction is disabled.
majorCompactionThreshold=0.5
# Interval to run major compaction, in seconds
# If it is set to less than zero, the major compaction is disabled.
majorCompactionInterval=86400
@jiazhai, currently I have set gcWaitTime to one minute, as recommended by Sijie. Please let us know if we need to change anything around these parameter values. Thanks!
# How long the interval to trigger next garbage collection, in milliseconds
# Since garbage collection is running in background, too frequent gc
# will hurt performance. It is better to give a higher number of gc
# interval if there is enough disk capacity.
gcWaitTime=60000
# If it is set to less than zero, the minor compaction is disabled.
minorCompactionThreshold=0.2
# Interval to run minor compaction, in seconds
# If it is set to less than zero, the minor compaction is disabled.
# Note: should be greater than gcWaitTime.
# (3600 seconds = 1 hour)
minorCompactionInterval=3600
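The settings above mix units: gcWaitTime is in milliseconds while the compaction intervals are in seconds, and the config note says the compaction interval should be greater than gcWaitTime. A minimal Python sketch of that sanity check follows. The helper names `parse_properties` and `check_gc_intervals` are made up for illustration and are not part of BookKeeper.

```python
def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and '#' comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def check_gc_intervals(props):
    """Return warnings for compaction intervals that are not above gcWaitTime.

    Hypothetical helper: gcWaitTime is read as milliseconds and the
    compaction intervals as seconds, as stated in the config comments.
    """
    warnings = []
    gc_wait_ms = int(props["gcWaitTime"])
    for name in ("minorCompactionInterval", "majorCompactionInterval"):
        if name not in props:
            continue
        interval_s = int(props[name])
        if interval_s < 0:
            continue  # a negative value disables that compaction
        if interval_s * 1000 <= gc_wait_ms:
            warnings.append(
                f"{name}={interval_s}s is not greater than gcWaitTime={gc_wait_ms}ms"
            )
    return warnings


SAMPLE = """
# values quoted in this thread
gcWaitTime=60000
minorCompactionInterval=3600
majorCompactionInterval=86400
"""

if __name__ == "__main__":
    for warning in check_gc_intervals(parse_properties(SAMPLE)):
        print("WARNING:", warning)
```

With the values quoted in this thread, both intervals are well above the one-minute gcWaitTime, so the check reports nothing.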
I am affected by this also - we're trying to reclaim disk for two reasons 1) Security / customer requirements to NOT retain data 2) Cost
What I have done so far inspired by @addisonj
i) Lowered majorCompactionInterval
from 86400 (1 day) to 7200 (2 hours)
ii) Lowered managedLedgerMaxLedgerRolloverTimeMinutes
from 240 (4 hours) to 90 (1.5 hours)
Open question on whether I should change majorCompactionThreshold
from the (Pulsar) default of 0.5 as I don't fully understand if lower values mean more or less disk space usage.
Thanks in advance
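On the threshold question: going only by the config comments quoted earlier in this thread, an entry log is compacted when its remaining (live) data percentage falls below the threshold. Under that reading, a lower threshold makes fewer entry logs eligible, so more disk stays occupied; a higher threshold reclaims more at the cost of more compaction I/O. A rough Python sketch of that selection rule, with made-up entry log names and ratios (this is not the bookie's actual code):

```python
def logs_eligible_for_compaction(live_ratios, threshold):
    """Return entry logs whose remaining (live) data ratio is below threshold."""
    return {name: ratio for name, ratio in live_ratios.items() if ratio < threshold}


# Hypothetical entry logs: fraction of each file that still holds live data.
live_ratios = {"0.log": 0.10, "1.log": 0.35, "2.log": 0.60, "3.log": 0.90}

if __name__ == "__main__":
    for threshold in (0.2, 0.5, 0.8):
        eligible = logs_eligible_for_compaction(live_ratios, threshold)
        print(f"threshold={threshold}: compact {sorted(eligible)}")
```

In this toy data, raising the threshold from 0.2 to 0.5 to 0.8 makes one, two, then three of the four logs eligible for compaction.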
Describe the bug The BookKeeper ledgers are not honoring the retention period, which causes excessive disk usage. During my test execution I set defaultRetentionTimeInMinutes to 6 and noticed that the bookie ledgers do not honor the retention period; they keep consuming disk space and eventually cause failures with "Not enough non-faulty bookie available". When I drilled down into the ledger disk, most messages were still retained even after the retention period had passed. Below is the list of broker parameter settings:
broker:
  configData:
    managedLedgerMinLedgerRolloverTimeMinutes: "1"
    managedLedgerCursorRolloverTimeInSeconds: "60"
    managedLedgerMaxLedgerRolloverTimeMinutes: "2"
    managedLedgerMaxEntriesPerLedger: "50000"
    defaultRetentionTimeInMinutes: "6"
    defaultRetentionSizeInMB: "1000"
    ttlDurationDefaultInSeconds: "3600"
To Reproduce Steps to reproduce the behavior:
Expected behavior Once the retention period has passed, the data should be deleted and the disk space freed.
Screenshots n/a
Desktop (please complete the following information):
Additional context Currently we have 100 GB for each bookie. Below is the disk calculation based on my message rate:
a = ingest rate = 844.023 msg/sec * 50 KB = 42201.2 KB/sec
b = a * retention period = 42201.2 * 360 (6 minutes) = 15192414 KB = 14.4 GB required (across 3 bookies) for a 6-minute duration
c = 100 GB per bookie (300 GB for 3 bookies)
so b < c
As per the above calculation, I have enough disk space but I am still facing issues.
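For reference, the same arithmetic as a small Python sketch. It assumes the figures above mean 844.023 msg/sec at roughly 50 KB per message; that reading is an assumption, though it does reproduce the 15192414 figure. It is a deliberately simplified estimate: it ignores replication across bookies and the delay before garbage collection actually frees space, both of which raise real usage.

```python
# Inputs taken from the report above; MSG_SIZE_KB is an assumed reading of it.
MSG_RATE_PER_SEC = 844.023
MSG_SIZE_KB = 50
RETENTION_SECONDS = 6 * 60  # defaultRetentionTimeInMinutes = 6
BOOKIES = 3
DISK_PER_BOOKIE_GB = 100

# a: ingest rate in KB/sec
ingest_kb_per_sec = MSG_RATE_PER_SEC * MSG_SIZE_KB

# b: data written during one retention window, in KB and in GB
retained_kb = ingest_kb_per_sec * RETENTION_SECONDS
retained_gb = retained_kb / (1024 * 1024)

# c: total raw disk across the cluster
total_disk_gb = BOOKIES * DISK_PER_BOOKIE_GB

if __name__ == "__main__":
    print(f"a = {ingest_kb_per_sec:.1f} KB/sec")
    print(f"b = {retained_kb:.0f} KB = {retained_gb:.2f} GB per retention window")
    print(f"c = {total_disk_gb} GB total, b < c: {retained_gb < total_disk_gb}")
```

Since b is far below c, the disk filling up points at space not being reclaimed on time rather than at undersized disks, which matches the lazy garbage collection behaviour described at the top of this thread.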