Closed: wohali closed this issue 5 years ago.
I know nothing of the CouchDB internals and basically nothing of Erlang. Question:
Why go through the DBs sequentially like that? We have a custom script to handle our database compaction right now. Step one is to get every DB and figure out how much space compaction would save. Step two is to start with the DB offering the largest savings and compact it. Repeat step two over the remaining DBs until the allowed time window expires.
The advantage being every day we target the DBs that need it most.
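For illustration, a minimal sketch of that "largest savings first" approach, assuming CouchDB 2.x, the Python requests library, and placeholder server URL, credentials, and time budget (none of these come from the original script):

```python
# Hypothetical sketch of the approach described above: rank databases by how
# much space compaction could reclaim, then compact the worst offenders until
# the nightly window closes. URL, credentials, and window are assumptions.
import time
from urllib.parse import quote

import requests

COUCH = "http://127.0.0.1:5984"   # assumption: node address
AUTH = ("admin", "password")      # assumption: admin credentials
WINDOW_SECONDS = 4 * 3600         # assumption: nightly compaction window

def db_info(name):
    return requests.get(f"{COUCH}/{quote(name, safe='')}", auth=AUTH).json()

def potential_savings(info):
    # file size minus live data size ~= space a compaction could reclaim
    sizes = info.get("sizes", {})
    return sizes.get("file", 0) - sizes.get("active", 0)

def compact(name):
    # POST /{db}/_compact needs a JSON content type and admin rights
    requests.post(
        f"{COUCH}/{quote(name, safe='')}/_compact",
        auth=AUTH,
        headers={"Content-Type": "application/json"},
    ).raise_for_status()

deadline = time.time() + WINDOW_SECONDS
dbs = requests.get(f"{COUCH}/_all_dbs", auth=AUTH).json()

# Step one: rank every database by potential savings. (This per-db info scan
# is exactly what the reply below flags as expensive at scale.)
ranked = sorted(dbs, key=lambda db: potential_savings(db_info(db)), reverse=True)

# Step two: compact the biggest offenders until the window closes.
for db in ranked:
    if time.time() >= deadline:
        break
    compact(db)
    # wait for this compaction to finish before starting the next one
    while db_info(db).get("compact_running") and time.time() < deadline:
        time.sleep(30)
```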
@regner loading the db info for every db at once is prohibitively expensive on disk and RAM, especially on busy clusters where doing so screws up both disk caching and our LRU cache.
Our couch fs crawler is deterministic, and it's good that we don't have to load it all into memory at once. Since we need to keep that mechanism, saving our current work point when the autocompactor is paused and then resuming from that point is a simple way to solve the problem at hand without destroying performance.
That makes perfect sense. Thanks for the explanation. :)
I vote for this issue. I have the exact same problem as @regner in #1579 on a server running 2.3.0 with about 10k databases and the following configuration for the compaction daemon:
check_interval = 300
min_file_size = 10485760
snooze_period = 1
_default = [{db_fragmentation, "50%"}, {view_fragmentation, "40%"}, {from, "20:00"}, {to, "08:00"}]
Some databases, and especially _global_changes, are never compacted, so disk use increases until I have to manually trigger a compaction on _global_changes with the _compact endpoint, which frees dozens of gigabytes.
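For reference, that manual workaround boils down to a single POST to the database's _compact endpoint; a minimal sketch, assuming a local node and admin credentials:

```python
# Minimal sketch of the manual workaround mentioned above: trigger a
# compaction of _global_changes via its _compact endpoint.
import requests

requests.post(
    "http://127.0.0.1:5984/_global_changes/_compact",
    auth=("admin", "password"),                     # assumption: admin creds
    headers={"Content-Type": "application/json"},   # required by CouchDB
).raise_for_status()
```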
@sblaisot we don’t take votes here, but patches are always welcome :)
I know, but knowing how many users are impacted can change prioritization ;)
We have a new compaction daemon in CouchDB 3.0 that should avoid this problem, since it runs all the time and has finer-grained tuning options. Closing this out for now.
See #1579.
In the default setup, with a short compaction window outside of business hours, a user with 130k databases finds that the compaction daemon never gets past the first ~2000 databases or so.
It would be helpful if the compaction daemon could bookmark the last database it got to during its last loop, so that future runs (the next day after business hours) could resume in the sequence where compaction was left off.
This wouldn't have to be kept through restarts, in my opinion, but it could be for niceness.
/cc @janl @nickva
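A minimal sketch of the proposed bookmark-and-resume behaviour, assuming the deterministic crawl order mentioned above and a hypothetical checkpoint file; the helpers here are illustrative only, not CouchDB internals:

```python
# Hypothetical sketch: remember the last database reached in one compaction
# window and resume from that point in the next one.
import os

CHECKPOINT = "/var/lib/couchdb/compaction.bookmark"   # hypothetical path

def load_bookmark():
    if os.path.exists(CHECKPOINT):
        with open(CHECKPOINT) as f:
            return f.read().strip() or None
    return None

def save_bookmark(db_name):
    with open(CHECKPOINT, "w") as f:
        f.write(db_name)

def dbs_from(all_dbs, bookmark):
    # Rotate the deterministic sequence so the next run picks up right after
    # the bookmark, then wraps around to cover the databases done last time.
    if bookmark in all_dbs:
        i = all_dbs.index(bookmark) + 1
        return all_dbs[i:] + all_dbs[:i]
    return all_dbs

def run_window(all_dbs, compact_one, window_expired):
    # compact_one(db) and window_expired() are hypothetical callables supplied
    # by the caller; the bookmark is advanced after each completed database.
    for db in dbs_from(all_dbs, load_bookmark()):
        if window_expired():
            break
        compact_one(db)
        save_bookmark(db)
```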