attic-labs / noms

The versioned, forkable, syncable database
Apache License 2.0

Implement batched writing #3179

Open ghost opened 7 years ago

ghost commented 7 years ago

At the moment, writes to a remote server must be staged locally, and then all new data is sent up on commit. This is wasteful in at least two directions and, in particular, slows down sync.

Needs clear design.

Blocked by:

ghost commented 7 years ago

So I think the abstract design for this is basically to "pull" the way NBS works across the wire. That is, an open NBS store is basically a transaction. Before commit (UpdateRoot/Flush), writes are buffered in a memtable and then flushed to file tables; the in-memory manifest keeps references to all novel tables, but doesn't update the durable manifest (i.e. the point at which other readers can see this tx's work) until commit.
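The NBS lifecycle described above can be sketched as an in-memory model. This is a minimal illustration, not the actual noms types; `store`, `table`, `chunk`, and the method names are all invented for this sketch.

```go
package main

import "fmt"

// Sketch of the NBS-style transaction lifecycle: writes buffer in a memtable,
// flush to immutable tables, and only commit publishes the novel tables to the
// durable manifest. All names are illustrative, not the real noms API.

type chunk struct {
	hash string
	data []byte
}

type table struct {
	name   string
	chunks []chunk
}

type store struct {
	memtable []chunk  // buffered writes, not yet durable
	novel    []table  // flushed tables, invisible to other readers
	manifest []string // durable manifest: table names visible to everyone
}

func (s *store) write(c chunk) {
	s.memtable = append(s.memtable, c)
}

// flush persists buffered chunks as a new table but does NOT touch the manifest.
func (s *store) flush() {
	if len(s.memtable) == 0 {
		return
	}
	t := table{name: fmt.Sprintf("table-%d", len(s.novel)), chunks: s.memtable}
	s.novel = append(s.novel, t)
	s.memtable = nil
}

// commit (UpdateRoot) publishes every novel table in one manifest update.
func (s *store) commit() {
	s.flush()
	for _, t := range s.novel {
		s.manifest = append(s.manifest, t.name)
	}
	s.novel = nil
}

func main() {
	s := &store{}
	s.write(chunk{hash: "a1", data: []byte("v1")})
	s.flush()
	fmt.Println(len(s.manifest)) // 0: flushed but not committed
	s.commit()
	fmt.Println(len(s.manifest)) // 1: now visible to other readers
}
```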

The idea with batched writing would be similar. Strawman:

1) writeValue becomes "write a new table, persist it, but don't include it in the manifest"; the call would return the name (hash) of the table.
2) UpdateRoot/Flush would be required to include the set of novel tables to include in the operation.
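A minimal sketch of that two-call strawman, with invented names throughout: the write call persists a table and returns its name (a hash of its contents), and the root update names exactly the novel tables to publish.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// Hypothetical remote endpoint for the strawman; not the real noms wire API.
type remote struct {
	tables   map[string][]byte // persisted but unpublished ("novel") tables
	manifest []string          // published table names
	root     string
}

// writeTable persists the serialized table and returns its content hash,
// which serves as the table's name. The manifest is not updated.
func (r *remote) writeTable(data []byte) string {
	name := fmt.Sprintf("%x", sha256.Sum256(data))
	r.tables[name] = data
	return name
}

// updateRoot publishes exactly the listed novel tables along with the new root.
func (r *remote) updateRoot(newRoot string, novel []string) error {
	for _, n := range novel {
		if _, ok := r.tables[n]; !ok {
			return fmt.Errorf("unknown table %s", n)
		}
		r.manifest = append(r.manifest, n)
	}
	r.root = newRoot
	return nil
}

func main() {
	r := &remote{tables: map[string][]byte{}}
	name := r.writeTable([]byte("chunk data"))
	fmt.Println(len(r.manifest)) // 0: persisted, not yet visible
	if err := r.updateRoot("newroot", []string{name}); err != nil {
		panic(err)
	}
	fmt.Println(len(r.manifest)) // 1: published atomically with the root
}
```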

There are two challenges to making this work:

1) The commit stage needs to ensure validity of the database. This would happen in two parts:
   a) Individual chunks will be integrity checked.
   b) During commit, the server must ensure completeness of all new values. The approach would be to process the novel tables and use the new lazy completeness validation to ensure there are no dangling references.

2) GC. Since we don't have a design yet for GC, we just need to be sure that the design for batch writing doesn't back us into a corner.

WRT (1), there are some choices around when (a) is done and whether some amount of work for (b) is preprocessed (i.e., for each novel table, keep a record somewhere of unresolved references, so that commit can avoid fetching and processing all of the tables).

ghost commented 7 years ago

One thing we should do as part of this is have the call to write a table return the set of dangling refs (if any). This way, sync can avoid writing a bunch of chunks and then immediately calling back, with basically the same data, to ask hasMany.
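That flow might look like the sketch below, with invented names: the write-table reply carries the refs the server couldn't resolve, so the client learns what's missing without a separate hasMany round trip.

```go
package main

import "fmt"

// Hypothetical server for the proposed reply shape; not the real noms API.
type server struct {
	have map[string]bool // chunk hashes the server already stores
}

type writeReply struct {
	tableHash    string   // name of the persisted novel table (hard-coded here)
	danglingRefs []string // refs not resolvable from this table or the store
}

// writeTable stores the table's chunks (hash -> outgoing refs) and reports
// any refs it cannot resolve, sparing the client a follow-up hasMany call.
func (s *server) writeTable(chunks map[string][]string) writeReply {
	for h := range chunks {
		s.have[h] = true
	}
	var dangling []string
	for _, refs := range chunks {
		for _, r := range refs {
			if !s.have[r] {
				dangling = append(dangling, r)
			}
		}
	}
	return writeReply{tableHash: "t1", danglingRefs: dangling}
}

func main() {
	s := &server{have: map[string]bool{"base": true}}
	// The new table references "base" (present) and "missing" (not yet sent).
	reply := s.writeTable(map[string][]string{"c1": {"base", "missing"}})
	fmt.Println(reply.danglingRefs) // client now knows to send "missing" next
}
```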