rocicorp / repc

The canonical Replicache client, implemented in Rust.

Implement Garbage Collection #208

Closed arv closed 3 years ago

arv commented 3 years ago

We use ref counting to implement garbage collection. This is done at the dag::Write layer. We do the garbage collection in dag::Write::commit.

When we call set_head we record the new and old hash so that we can increment and decrement the ref counts later in commit.

When we call put_chunk we record the hash so that we can check for orphaned chunks in commit.
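As a rough illustration of the bookkeeping described above, here is a minimal in-memory sketch. It is not the repc implementation: the `Store` and `Chunk` types, the string hashes, and the `inc`/`dec`/`count` helpers are hypothetical stand-ins, and only the names `set_head`, `put_chunk`, and `commit` come from the description. It assumes each chunk lists the hashes of the chunks it references, and that a chunk's references are counted only while the chunk itself is referenced.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical chunk: opaque data plus the hashes of the chunks it references.
#[derive(Clone, Debug)]
struct Chunk {
    data: Vec<u8>,
    refs: Vec<String>,
}

/// Hypothetical in-memory store standing in for the real key-value layer.
#[derive(Default)]
struct Store {
    chunks: HashMap<String, Chunk>,
    ref_counts: HashMap<String, u32>,
    heads: HashMap<String, String>,
}

impl Store {
    fn count(&self, hash: &str) -> u32 {
        self.ref_counts.get(hash).copied().unwrap_or(0)
    }

    /// Add one reference; on the first reference, also count the chunk's children.
    fn inc(&mut self, hash: &str) {
        let n = self.count(hash) + 1;
        self.ref_counts.insert(hash.to_string(), n);
        if n == 1 {
            let refs = self.chunks.get(hash).map(|c| c.refs.clone()).unwrap_or_default();
            for r in refs {
                self.inc(&r);
            }
        }
    }

    /// Drop one reference; at zero, delete the chunk and release its children.
    fn dec(&mut self, hash: &str) {
        let n = self.count(hash).saturating_sub(1);
        if n > 0 {
            self.ref_counts.insert(hash.to_string(), n);
            return;
        }
        self.ref_counts.remove(hash);
        if let Some(chunk) = self.chunks.remove(hash) {
            for r in &chunk.refs {
                self.dec(r);
            }
        }
    }
}

struct Write<'a> {
    store: &'a mut Store,
    /// head name -> (hash before this write, latest hash set in this write)
    changed_heads: HashMap<String, (Option<String>, String)>,
    /// hashes of chunks written during this write
    new_chunks: HashSet<String>,
}

impl<'a> Write<'a> {
    fn new(store: &'a mut Store) -> Self {
        Write { store, changed_heads: HashMap::new(), new_chunks: HashSet::new() }
    }

    /// Store the chunk and remember its hash for the orphan check in `commit`.
    fn put_chunk(&mut self, hash: &str, chunk: Chunk) {
        self.store.chunks.insert(hash.to_string(), chunk);
        self.new_chunks.insert(hash.to_string());
    }

    /// Move the head and record old and new hashes; counts are adjusted in `commit`.
    fn set_head(&mut self, name: &str, hash: &str) {
        let old = self.store.heads.insert(name.to_string(), hash.to_string());
        self.changed_heads
            .entry(name.to_string())
            .and_modify(|e| e.1 = hash.to_string())
            .or_insert((old, hash.to_string()));
    }

    /// Apply the recorded ref count changes, then drop chunks nothing references.
    fn commit(self) {
        let Write { store, changed_heads, new_chunks } = self;
        // Increment before decrementing so chunks shared by the old and new
        // heads never drop to zero in between.
        for (_, new) in changed_heads.values() {
            store.inc(new);
        }
        for (old, _) in changed_heads.values() {
            if let Some(old) = old {
                store.dec(old);
            }
        }
        // Chunks written in this transaction that ended up unreferenced are orphans.
        for hash in &new_chunks {
            if store.count(hash) == 0 {
                store.chunks.remove(hash);
            }
        }
    }
}
```

In this sketch, moving a head from one chunk to another that shares a child leaves the shared child in place and deletes only the chunk that is no longer reachable.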

Fixes #34

aboodman commented 3 years ago

https://i.imgflip.com/1qx7x7.jpg

On Fri, Oct 9, 2020 at 1:48 PM Erik Arvidsson notifications@github.com wrote:

@arv https://github.com/arv requested your review on: #208 https://github.com/rocicorp/repc/pull/208 Implement Garbage Collection.


aboodman commented 3 years ago

But to be clear let's just try the two options for now: put this in the meta key, or put it on its own.

aboodman commented 3 years ago

Sorry, now that I think about this for a few more moments:

arv commented 3 years ago

Size numbers:

master:

replicache_client.js: 24484
replicache_client.js.br: 4532
replicache_client_bg.wasm: 527419
replicache_client_bg.wasm.br: 152207

ref-counting:

replicache_client.js: 24484
replicache_client.js.br: 4548
replicache_client_bg.wasm: 549671
replicache_client_bg.wasm.br: 158369
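
To put the size numbers in relative terms, the deltas can be computed as in the small sketch below. The byte counts are copied from the two lists above; the `percent_change` and `size_report` helpers are hypothetical names used only for this illustration.

```rust
/// Percentage change going from `before` to `after`.
fn percent_change(before: u64, after: u64) -> f64 {
    (after as f64 - before as f64) / before as f64 * 100.0
}

/// (file name, bytes on master, bytes on ref-counting), as reported above.
const SIZES: [(&str, u64, u64); 4] = [
    ("replicache_client.js", 24484, 24484),
    ("replicache_client.js.br", 4532, 4548),
    ("replicache_client_bg.wasm", 527419, 549671),
    ("replicache_client_bg.wasm.br", 152207, 158369),
];

/// One line per file: absolute and relative size difference.
fn size_report() -> Vec<String> {
    SIZES
        .iter()
        .map(|(name, before, after)| {
            format!(
                "{}: {:+} bytes ({:+.2}%)",
                name,
                *after as i64 - *before as i64,
                percent_change(*before, *after)
            )
        })
        .collect()
}
```

By this calculation the wasm binary grows by 22252 bytes uncompressed and 6162 bytes with Brotli, roughly 4% in both cases.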

Perf numbers:

master:

populate 1024x1000 (clean) x 0.80 MB/s ±0.0% (0 runs sampled)
populate 1024x1000 (dirty) x 0.80 MB/s ±0.0% (0 runs sampled)
scan 1024x1000 x 20.57 MB/s ±0.0% (0 runs sampled)
scan 1024x5000 x 21.15 MB/s ±0.0% (0 runs sampled)

ref-counting:

populate 1024x1000 (clean) x 0.75 MB/s ±0.0% (0 runs sampled)
populate 1024x1000 (dirty) x 0.74 MB/s ±0.0% (0 runs sampled)
scan 1024x1000 x 19.28 MB/s ±0.0% (0 runs sampled)
scan 1024x5000 x 21.06 MB/s ±0.0% (0 runs sampled)

arv commented 3 years ago

Not clear what the clippy error is about. Does not repro locally.

arv commented 3 years ago

I'll put the count in meta next week. Based on my earlier work it is a lot more complicated...

aboodman commented 3 years ago

You have to run clippy with warnings.

arv commented 3 years ago

image

arv commented 3 years ago

I created another branch where the ref count is in the meta data.

Perf:

populate 1024x1000 (clean) x 0.77 MB/s ±0.0% (0 runs sampled)
populate 1024x1000 (dirty) x 0.71 MB/s ±0.0% (0 runs sampled)
scan 1024x1000 x 19.15 MB/s ±0.0% (0 runs sampled)
scan 1024x5000 x 21.58 MB/s ±0.0% (0 runs sampled)

https://github.com/arv/repc/pull/1 for reference