dscarpetti / codax

An idiomatic transactional embedded database for clojure
Eclipse Public License 1.0

Increasing file size #23

Closed Frozenlock closed 6 years ago

Frozenlock commented 6 years ago

I've noticed that the codax file keeps growing even when the stored map is small. The growth seems to be related to the number of manipulations rather than to the amount of data.

A small map (~100 KB when written to a text file) produces a nodes file bigger than 10 MB after a few dozen overwrites.

dscarpetti commented 6 years ago

Depending on the shape of the data, that isn't too surprising. Automatic compaction is triggered every 10,000 updates on an open database. Unfortunately, if the database is closed and reopened, the counter resets. While this design works well enough for long-running server systems, it is clearly insufficient for other, otherwise good use cases.

For the time being, it is possible to trigger database compaction manually by calling the codax.store/compact-database function on an open database instance. It should reduce the file size back to something on the order of the expected 100 KB. Please let me know whether it does.
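A minimal sketch of that workaround, assuming the database is opened and closed with codax.core/open-database! and codax.core/close-database! (older releases may use different names) and that compact-database takes the open database instance as its only argument, as described above. The path and the stored data are placeholders.

```clojure
(require '[codax.core :as c]
         '[codax.store :as store])

;; Placeholder path, for illustration only.
(def db (c/open-database! "data/example-db"))

;; A few dozen overwrites of the same key, similar to the usage in the report.
(dotimes [i 50]
  (c/assoc-at! db [:settings] {:revision i}))

;; Manually trigger compaction on the open database instance.
;; Assumption: single-argument call, per the description in this thread.
(store/compact-database db)

(c/close-database! db)
```

Comparing the size of the nodes file before and after the compact-database call is a quick way to check that the compaction took effect.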

I will probably add the manual compaction function as compact-database! to the core namespace in a minor update, though that is a bit of a half measure, as I would prefer the database do a better job of managing its bookkeeping without direct user intervention. The current setup was designed to lower compaction overhead on large databases, but the compaction time for small databases is insignificant. The long-term solution will probably be to automatically adjust the compaction interval with database size.
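One way to picture an interval that scales with database size is sketched below. This is not codax's implementation; the function name, the divisor, and the lower bound are hypothetical, and only the 10,000-update ceiling comes from the behavior described in this thread.

```clojure
(defn compaction-interval
  "Hypothetical helper: number of updates to allow between compactions,
  given a rough measure of database size (here, a node count).
  Small databases compact often; large ones are capped at 10,000 updates."
  [node-count]
  (-> (quot node-count 10) ; scale the interval with size (assumed ratio)
      (max 100)            ; assumed floor so tiny databases don't compact constantly
      (min 10000)))        ; ceiling matching the fixed interval mentioned above

(compaction-interval 50)      ; small database, clamped to the floor
(compaction-interval 20000)   ; mid-sized database, scales linearly
(compaction-interval 1000000) ; large database, clamped to the ceiling
```

With a rule of this shape, a database that only sees a few hundred writes per session would still reach its compaction threshold, provided the update counter also survives a close and reopen.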

Frozenlock commented 6 years ago

Thank you for the very detailed answer!

The counter reset explains why no compaction was taking place. (The database is only used for a few hundred writes at a time.)

I will call codax.store/compact-database manually until some form of automatic compaction is available.

Thanks again!

Frozenlock commented 6 years ago

I don't know if you want to leave this issue open as a reminder, so I'll leave it up to you.

dscarpetti commented 6 years ago

Happy to help! And thanks for opening the issue.

I'll leave it open until a better solution is ready, both as a reminder and as a reference for anyone else experiencing the same problem.

dscarpetti commented 6 years ago

I released version 1.3.0, which should fix this issue. Compaction now happens much more frequently on small datasets and is no longer delayed by closing the database.

While I don't expect the codax.store/compact-database function to go away anytime soon, it isn't really part of the public API, so I wouldn't recommend calling it directly going forward.

Frozenlock commented 6 years ago

Awesome, thank you very much!