attic-labs / noms

The versioned, forkable, syncable database
Apache License 2.0
7.45k stars 266 forks source link

Don't write a sequence chunk if there is no parent #3699

Closed kalman closed 6 years ago

kalman commented 6 years ago

In most cases this will avoid writing the root chunk of a prolly tree, which is the behavior we're aiming for: a prolly tree might be used inline in which case the root never needs to be written.

The solution in this patch is imperfect because it may unnecessarily write chunks, but this is rare.

Fixes https://github.com/attic-labs/noms/issues/3645

ghost commented 6 years ago

Thinking about this solution this way actually makes me feel better about it. Basicallly there are two cases here to worry about:

1) Upon completion, we want to avoid persisting the resulting chunk from the root elements 2) It's possible that N levels below the top chunker will have randomly chunked on the first item and we'll normalize by walking down. In this case, we'd prefer to have avoided the canonical root chunk.

This patch completely handles (1), but doesn't address (2). let's just open a bug for (2) and fix it...probably never.

kalman commented 6 years ago

OK, ptal. Comments addressed, plus I did a bit of cleanup from the previous attempt at solving this problem.

I'm not sure that passing in a bool to createSequence (new code) is better than checking for the presence of a parent (old code), because we still need to know when to write any unwritten collection, which is when createParent is called.

I.e. I find it more direct seeing "if there is a parent, write the sequence" and "when there is a new parent, write an unwritten sequence" vs looking at call sites of createSequence to make that connection.

Arguments either way.

kalman commented 6 years ago

I'm going to do it separately so we can discuss separately.

cmasone-attic commented 6 years ago

For what it's worth, I actually tried initializing sc.current to some non-zero size and it didn't change the profile much. Re-using the storage is the real win.