replikativ / hitchhiker-tree

Functional, persistent, off-heap, high performance data structure
Eclipse Public License 1.0
44 stars 19 forks source link

Tracing GC for konserve #9

Closed csm closed 3 years ago

csm commented 4 years ago

Based partially on https://github.com/datacrypt-project/hitchhiker-tree/pull/24. Rewritten to primarily use core.async.

Would need something like https://github.com/replikativ/konserve/pull/29 to iterate keys in the store.

whilo commented 4 years ago

That is already considerably simpler. It would be ideal to collect all hitchhiker-tree marks in a set and then return it. That way we can add additional keys to mark (e.g. for the roots of all Datahike branches or maybe other things in store) and call a sweep function separately with the marks set. Sorry, I should have been more explicit about this. It would also be nice if the collection of marks would work with core.async as well (without <??), but we can fix that later.

whilo commented 4 years ago

The mark phase is a pure function because the tree roots we are giving to it are constants and the store is immutable then. We just happen to read constants from disk. That way we separate it from the side-effects of removing the stale nodes.

whilo commented 3 years ago

@csm Hey. We have now implemented this with the help of metadata support in our key-value store protocol, which does not clutter the value semantics with timestamps. Thanks a lot for pushing the GC and if you have any further needs, feel free to reach out or open more PRs.