thesaurus-linguae-aegyptiae / bts

Berlin Text System - Collaborative Editing of Ancient Egyptian Texts and Dictionaries
GNU Lesser General Public License v3.0

Database disk space consumption. #11

Open JKatzwinkel opened 9 years ago

JKatzwinkel commented 9 years ago

This might become an issue much sooner than expected (at least by me), especially on local instances. The current CouchDB configuration leaves _revs_limit at its default of 1000, and continuous compaction via the compaction daemon is not enabled. These settings should probably be adjusted, since some local CouchDB instances are known to consume several GB of disk space.

Triggering compaction of individual collections is already supported by lightcouch, but it is currently used only on the notification collection, and so far without noticeable effect.
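For reference, both settings can be changed per database through CouchDB's HTTP API (`PUT /{db}/_revs_limit` and `POST /{db}/_compact`). The following is only a minimal sketch of what such calls could look like, not what BTS currently does: the base URL, the value 50, and the helper names are assumptions for illustration, and a real instance would also need admin credentials.

```python
import json
import urllib.parse
import urllib.request

# Assumption: a local CouchDB instance on the default port.
BASE_URL = "http://localhost:5984"


def revs_limit_request(db, limit, base_url=BASE_URL):
    """Build a PUT /{db}/_revs_limit request; the body is the bare integer."""
    if limit < 1:
        raise ValueError("limit must be a positive integer")
    name = urllib.parse.quote(db, safe="")
    return urllib.request.Request(
        f"{base_url}/{name}/_revs_limit",
        data=json.dumps(limit).encode("utf-8"),
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


def compact_request(db, base_url=BASE_URL):
    """Build a POST /{db}/_compact request (CouchDB expects a JSON content type)."""
    name = urllib.parse.quote(db, safe="")
    return urllib.request.Request(
        f"{base_url}/{name}/_compact",
        data=b"",
        method="POST",
        headers={"Content-Type": "application/json"},
    )


def send(request):
    """Send a prepared request and decode the JSON reply (needs a running server)."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

Something like `send(revs_limit_request("notification", 50))` followed by `send(compact_request("notification"))` would then lower the limit and start a compaction run for that one collection. Building the requests separately from sending them keeps the sketch easy to inspect without a server.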

How can we prevent local databases from sprawling and suffering fragmentation due to growing revision histories and dead objects?

cplutte commented 8 years ago

Compaction is used on <notification> because its history is not relevant. It is not run automatically on other db collections because that would delete the local revision history. However, a user-friendly routine to compact other db collections once disk usage exceeds a certain limit would be very useful.

I suggest two solutions: 1) reduce <_revs_limit> and either enable compaction via the daemon or trigger compaction within the <_revs_limit> on shutdown; 2) provide additional compaction commands for individual db collections via DBManagerDialog.
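To make the "compact once a limit is exceeded" idea concrete, here is a rough sketch of the decision step only. It assumes the database info document returned by `GET /{db}` as input (`disk_size` and `data_size` on CouchDB 1.x, `sizes.file` and `sizes.active` on newer versions). The thresholds and function names are made up for illustration, and the fragmentation ratio follows the usual (file size - live data) / file size definition.

```python
def file_and_live_size(info):
    """Return (file size, live data size) in bytes from a db info document."""
    sizes = info.get("sizes")
    if sizes is not None:
        # Newer CouchDB versions report sizes in a nested object.
        return sizes["file"], sizes["active"]
    # CouchDB 1.x field names.
    return info["disk_size"], info["data_size"]


def fragmentation_percent(info):
    """Share of the database file that is not live data, in percent."""
    file_size, live_size = file_and_live_size(info)
    if file_size <= 0:
        return 0.0
    return (file_size - live_size) / file_size * 100


def needs_compaction(info, max_file_bytes, min_fragmentation=50.0):
    """Hypothetical policy: compact only large files with enough dead data."""
    file_size, _ = file_and_live_size(info)
    return (
        file_size > max_file_bytes
        and fragmentation_percent(info) >= min_fragmentation
    )
```

A shutdown hook or a dialog button could run this check per collection and only trigger compaction where it returns true, so that small or barely fragmented collections keep their history untouched.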

JKatzwinkel commented 8 years ago

Querying the revision numbers of objects in frequently updated corpora suggests that finding an appropriate _revs_limit value could be tricky. I suspect that inactive and unreferenced objects, i.e. deleted and orphaned ones, have a greater impact.
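As an illustration of the kind of query meant here: the leading number of a `_rev` such as `12-abc...` counts how often a document has been updated, so the rows of a `GET /{db}/_all_docs` response are enough to summarise update counts per collection. This is a sketch under that assumption; the function names and the sample limit are hypothetical.

```python
import statistics


def revision_generation(rev):
    """Extract the leading generation number from a _rev like '12-abc'."""
    return int(rev.split("-", 1)[0])


def generation_stats(rows, limit):
    """Summarise update counts for rows shaped like an _all_docs response."""
    generations = [revision_generation(row["value"]["rev"]) for row in rows]
    if not generations:
        return {"count": 0, "max": 0, "median": 0, "over_limit": 0}
    return {
        "count": len(generations),
        "max": max(generations),
        "median": statistics.median(generations),
        # Documents updated more often than the candidate limit.
        "over_limit": sum(1 for g in generations if g > limit),
    }
```

Comparing `over_limit` for a few candidate values would show how many documents a lower limit actually affects. Deleted documents do not appear in `_all_docs`, so their share would have to be estimated separately.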

cplutte commented 8 years ago

Let's have a telco on this issue. Tuesday the 8th, ok?

JKatzwinkel commented 8 years ago

Ok!