MatrixAI / js-db

Key-Value DB for TypeScript and JavaScript Applications
https://polykey.com
Apache License 2.0
5 stars 0 forks source link

Introduce Snapshot Isolation OCC to DBTransaction #18

Closed CMCDragonkai closed 2 years ago

CMCDragonkai commented 2 years ago

Description

This implements the snapshot isolation DB transaction.

This means DBTransaction will be automatically snapshot isolated, which means most locking will be unnecessary.

Instead when performing a transaction, there's a chance for a ErrorDBTransactionConflict exception which means there was a write conflict with another transaction.

Users can then decide on their discretion to retry the operation if they need to (assuming any non-DB side-effects are idempotent, noops or can be compensated). This should reduce the amount of locking overhead we need to do in Polykey. We may bubble up the conflict exception to the user, so the user can re-run their command, or in some cases, in-code we will automatically perform a retry. The user in this case can be the PK client, or the another PK agent or the PK GUI.

There is still one situation where user/application locks are needed, and that's where there may be a write-skew. See snapshot isolation https://en.wikipedia.org/wiki/Snapshot_isolation for more details and also https://www.cockroachlabs.com/blog/what-write-skew-looks-like/.

In the future we may upgrade to SSI (serializable snapshot isolation) which will eliminate this write-skew possibility.

Additionally this PR will also enable the keyAsBuffer and valueAsBuffer options on the iterators, enabling easier usage of the iterators without having to use dbUtils.deserialize<T>(value) where it can be configured ahead of time.

Issues Fixed

Tasks

Final checklist

CMCDragonkai commented 2 years ago

@emmacasolin @tegefaulkes watch this PR!

CMCDragonkai commented 2 years ago

Now I have an interesting test to demonstrate the snapshot's capabilities.

import { withF, ResourceRelease } from '@matrixai/resources';
import { DB } from './src';
import * as testsUtils from './tests/utils';

async function main () {

  const key = await testsUtils.generateKey(256);

  const db = await DB.createDB({
    dbPath: './tmp/db',
    crypto: {
      key,
      ops: { encrypt: testsUtils.encrypt, decrypt: testsUtils.decrypt },
    },
    fresh: true
  });

  await db.put('key', 'value');

  const t1 = withF([db.transaction()], async ([tran]) => {
    console.log('T1 START');
    await testsUtils.sleep(100);
    console.log('T1: LOADING KEY2', await tran.get('key2'));
    console.log('T1 FINISH');
  });

  // T2 starts after T1 starts
  const t2 = withF([db.transaction()], async ([tran]) => {
    console.log('T2 START');
    await tran.put('key2', 'value2');
    console.log('T2 FINISH');
  });

  await t2;
  await t1;

  const t3 = withF([db.transaction()], async ([tran]) => {
    console.log('T2 START');
    console.log('T3: LOADING KEY2', await tran.get('key2'));
    console.log('T3 FINISH');
  });

  await t3;

  await db.stop();

}

main();

In master the output would be:

T1 START
T2 START
T2 FINISH
T1: LOADING KEY2 value2
T1 FINISH
T2 START
T3: LOADING KEY2 value2
T3 FINISH

With the new snapshot iterator, the result is:

T1 START
T2 START
T2 FINISH
T1: LOADING KEY2 undefined
T1 FINISH
T2 START
T3: LOADING KEY2 value2
T3 FINISH

With SI, because T1 and T2 are started at the same time, they get the original state of the database, and T2 commits before T1 reads the key, but because T1 still has the original snapshot, it continues to read undefined.

CMCDragonkai commented 2 years ago

To allow future debugging, I've added debug logging statements for DBTransaction state transitions, creating, destroying, committing, rollbacking, finalizing.

CMCDragonkai commented 2 years ago

I just realised it's not possible to get a derived iterator for DBTransaction.iterator based on the snapshot iterator. This is simply because if the option is reverse: true, this wouldn't be applied to the snapshot iterator, therefore it couldn't be used.

That will mean, that until we can have access to raw iterators (https://github.com/Level/leveldown/issues/486#issuecomment-1116170897), if users want to have a consistent iterators with respect to the transactional snapshot, the iterators must be created at the beginning of the transaction. Any subsequent creation would take the snapshot of the DB state at that point taking into account any changes to the DB state by other committed transactions. I'm not fully sure if this is the case because createTransaction is asynchronous. We may need to do some tests to see if is possible to intercept another asynchronous operation in between starting the transaction and creating the iterator.

We can do a synchronous constructor call instead if not.

Anyway this should mean that consistent iteration can only be done with:

withF([db.transaction()], async (tran) => {
  const i1 = tran.iterator();
  const i2 = tran.iterator();
  // do work afterwards
});
CMCDragonkai commented 2 years ago

I've found a scenario where creating an iterator up front is still not consistent. This may mean we need to change the way we construct our transaction (to be more synchronous).

import { withF } from '@matrixai/resources';
import { DB } from './src';
import * as testsUtils from './tests/utils';

async function main () {

  const key = await testsUtils.generateKey(256);

  const db = await DB.createDB({
    dbPath: './tmp/db',
    crypto: {
      key,
      ops: { encrypt: testsUtils.encrypt, decrypt: testsUtils.decrypt },
    },
    fresh: true
  });

  await db.put('key', 'value');

  const p = db.put('key', 'value3');

  const t1 = withF([db.transaction()], async ([tran]) => {
    const i1 = tran.iterator({ keyAsBuffer: false, valueAsBuffer: false });

    console.log('CREATED ITERATOR');

    i1.seek('key')
    const v1 = (await i1.next())![1];
    await i1.end();

    const v2 = await tran.get('key');

    console.log(v1, v2, v1 === v2);
  });

  await p;
  await t1;

  await db.stop();
}

main();

Here we see that the newly created iterator gets value3, whereas the snapshot iterator gets value. This is not consistent.

CMCDragonkai commented 2 years ago

There's an old fork of leveldown that did expose snapshots to the interface: https://github.com/frankboehmer/leveldown/commits/master

CMCDragonkai commented 2 years ago

Ok by making DBTransaction construction fully synchronous, we do not have the above problem. However the above problem can still arise, as long as there is a time-difference between constructing the transaction, and performing the first non-snapshot iterator creation.

      const tran = new DBTransaction({
        db: this,
        transactionId,
        logger: this.logger,
      });

This can happen as long as you do withF([db.transaction(), someOtherResource()], async ([tran]) => { /* ... */ });. Simply because the withF will do async work in between acquiring the resource and running the transaction.

A different foolproof trick would be delay the main snapshot iterator creation until the first get or iterator call is made. This would guarantee that we would have had to be in the resource handler callback when the first snapshot iterator is created.

But it also means that snapshot iterator lifecycle is not tied directly to transaction lifecycle. It also makes snapshot creation lazy, in that if a transaction only does put or del, it does not bother creating that initial consistent snapshot. This can be a bit confusing for end-users to reason about when consistent snapshot begins.

Instead one can trigger snapshot creation as soon as get, put, del, or iterator method is called. That way, it is at least observable that only when the tran is actually interacted with does the consistent lifecycle start.

The best solution is for leveldb to expose snapshots directly and for that we need to work in C++ and fork the leveldb for js-db.

CMCDragonkai commented 2 years ago

Expecting iterators to be created up front is not going to be good for usability. Consider that in PK our transactions may be created high the call graph, and the context may be passed down to many domain methods. It is therefore important that downstream users can expect consistency when creating their iterators. Maybe it's time to start the C++ journey!

CMCDragonkai commented 2 years ago

From https://github.com/google/leveldb/blob/main/doc/index.md#snapshots.

In C++, snapshots are used like this:

leveldb::ReadOptions options;
options.snapshot = db->GetSnapshot();
// ... apply some updates to db ...
leveldb::Iterator* iter = db->NewIterator(options);
// ... read using iter to view the state when the snapshot was created ...
delete iter;
db->ReleaseSnapshot(options.snapshot);

Here we can see that even after mutating the db, just simply creating the iterator with db->NewIterator(options) passes the snapshot in the ReadOptions struct (I'm guessing it's a struct), and ensures that the iterator reads from the same snapshot.

This basically means that read operations using get or iterator can make use of the same snapshot.

This means the final API might like this:

// Internally inside `DBTransaction`
const snapshot = db.snapshot();
const iterator = db.iterator({ snapshot }, levelPath);

// Externally at the end-user
await withF([this.db.transaction()], async ([tran]) => {
  const i1 = tran.iterator();
  const i2 = tran.iterator();
  // we can expect that `i1` and `i2` are consistent with the transactional snapshot iterator
});

The snapshot option is passed in automatically into our AbstractIteratorOptions. We may extend this type to also take the snapshot object.

CMCDragonkai commented 2 years ago

The forked implementation of snapshots https://github.com/frankboehmer/leveldown/commit/a78ddc1016297dbe20703b685fce2aba256fd9b3 done in 2018 used the NAN mechanism. This was an older way of creating native addons: https://stackoverflow.com/a/59440434/582917. The newer way uses NAPI, so we will have to study both and compare what will be needed.

CMCDragonkai commented 2 years ago

The lmdb-js is an interesting solution that also has transaction support, however we know leveldb well enough that we can't pursue this alternative yet https://github.com/MatrixAI/js-db/commit/6107d9c3ac55767113034bcedd19b379a5181a1d#r73267460.

CMCDragonkai commented 2 years ago

The lmdb-js project if we switch to that, could potentially solve a number of issues:

It would be interesting to explore this in the future, since that would indicate alot of work, and we still have to compare with potentially using rocksdb instead.

CMCDragonkai commented 2 years ago

@emmacasolin I had a look at this current branch after I rebased on top of master.

I see that the package-lock.json wasn't updated on the last release.

You should make sure that npm install was performed with the necessary package changes.

Also I think you removed a necessary devDependency being of @types/node-forge.

Can you create another feature branch, enter nix-shell, add back @types/node-forge and do a full npm install which should update package-lock.json as well. And then merge, and do another publish to 3.3.4 (generate docs).

CMCDragonkai commented 2 years ago

Closed in favour of #19. That's now the new feature branch. I'll be bringing in the changes from TypeScript-Demo-Lib there too.