MatrixAI / Polykey

Polykey Core Library
https://polykey.com
GNU General Public License v3.0

Fine Grained Concurrency Control Standardisation #294

Closed CMCDragonkai closed 2 years ago

CMCDragonkai commented 2 years ago

Specification

Locking in js-polykey has gone through a lot of iterations. The most recent iteration is in js-db's usage of locks, where fine-grained locks provided by async-mutex are used, along with the RWLock abstraction in src/utils.ts.

For most domains, such as ACL, the locks are too coarse-grained, forcing one to lock the entire domain. Many of these locks should be replaced with DB transactions, as is done in js-encryptedfs.

In some places, fine-grained locks can replace the existing coarse-grained locking, or even replace the usage of condition variables. Currently src/network/Connection.ts and its derived classes make use of a _composed boolean which does double duty: it indicates when composition is done, but it is also used to prevent repeated concurrent calls of compose. The first duty is fine, but the second should be done with a fine-grained lock shared between all the calls that should block while the composition operation is occurring, that is, all the methods that currently check _composed and throw exceptions when it is not true.
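The split between the two duties can be sketched as follows. This is a hypothetical illustration, not the real Connection class: the `ComposableConnection` name and the promise-chain mutex (standing in for an async-mutex lock) are assumptions for the sketch.

```typescript
class ComposableConnection {
  protected _composed: boolean = false;
  protected composeLock: Promise<void> = Promise.resolve();
  public composeCount: number = 0; // counts how often composition work actually ran

  // Serialise critical sections on a promise chain (stand-in for a real mutex).
  protected withComposeLock<T>(f: () => Promise<T>): Promise<T> {
    const result = this.composeLock.then(f);
    this.composeLock = result.then(() => undefined, () => undefined);
    return result;
  }

  public compose(): Promise<void> {
    return this.withComposeLock(async () => {
      if (this._composed) return; // first duty: state indicator
      this.composeCount++;        // the actual composition work runs at most once
      this._composed = true;
    });
  }

  public get composed(): boolean {
    return this._composed;
  }
}
```

Concurrent calls to `compose()` now queue on the lock instead of relying on the boolean to reject them, while `_composed` keeps its legitimate role as a state flag.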

Transactions have been introduced to js-db. With them we can replace a lot of the existing locking with the new DB transactions. The general changes that need to be implemented are as follows.

  1. Update any transact wrappers by removing them or using withF/withG locking directly.
  2. In all cases where there is a conflict, just throw it up the stack. We expect to handle conflicts within the handlers, or look deeper into it later.
  3. The ErrorDBTransactionConflict error should never be seen by the user. We should catch it and override it with a more descriptive error for the context.
  4. Transactions should be started within the handlers and passed all the way down to where they are needed. The idea is to make each handler atomic.
  5. Concurrency testing should be introduced, but only after SI transactions have been implemented.
  6. All usage of DB should be updated to use the new API. This means removing sublevels and utilising LevelPath and KeyPath instead.
  7. All usage of db streams should be replaced with the db iterator.
  8. All instances of db.put, db.get and db.del should use transactions via tran.put/get/del.

This applies to all domains that make use of DB, or domains that depend on others that make use of DB. The goal here is to make every operation, starting from the handlers, atomic.
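The handler-atomicity pattern can be sketched with toy stand-ins. The names here (`ToyDB`, `withTransaction`, `aclSetPerm`, `gestaltLink`) are assumptions for illustration, not the real js-db or domain APIs: one transaction is opened at the handler and threaded through every domain call, so the handler commits or rolls back as a unit.

```typescript
type ToyTran = {
  ops: Array<[string, string]>;
  put(key: string, value: string): void;
};

class ToyDB {
  public store = new Map<string, string>();
  // One transaction per handler: all writes commit together or not at all.
  public async withTransaction<T>(f: (tran: ToyTran) => Promise<T>): Promise<T> {
    const tran: ToyTran = {
      ops: [],
      put(key, value) { this.ops.push([key, value]); },
    };
    const result = await f(tran); // if f throws, nothing is committed
    for (const [key, value] of tran.ops) this.store.set(key, value);
    return result;
  }
}

// Domain methods take `tran` instead of locking their whole domain.
async function aclSetPerm(tran: ToyTran, nodeId: string): Promise<void> {
  tran.put(`acl/${nodeId}`, 'perm');
}
async function gestaltLink(tran: ToyTran, nodeId: string): Promise<void> {
  tran.put(`gg/${nodeId}`, 'linked');
}

// The handler is the atomic boundary for everything below it.
async function handler(db: ToyDB, nodeId: string): Promise<void> {
  await db.withTransaction(async (tran) => {
    await aclSetPerm(tran, nodeId);
    await gestaltLink(tran, nodeId);
  });
}
```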

There are limitations to this, however, since a transaction can fail if there are overlapping edits between transactions. We can't really include changes to the db that will commonly, or are guaranteed to, conflict. Examples of this are counters or commonly updated fields. So far this has been seen in:

  1. NotificationsManager. Makes use of a counter, so any transaction that includes adding or removing a notification WILL conflict. Reads also update metadata, so concurrently reading the same message WILL conflict.
  2. More to follow?

In some cases we will need to make use of locking along with a transaction. A good example of this is in the NotificationsManager, where we are locking the counter update. When this is the case we need to take extra care with the locking: unless the lock wraps the whole transaction, it is still possible to conflict on the transaction. We can't compose operations that rely on this locking with larger transactions.

An example of this problem is:

start tran1
start tran2

start lock
    tran1 update counter
end lock

start lock
    tran2 update counter
end lock

end tran1
end tran2 // conflict!

To avoid this, tran2 has to start after tran1 ends. We need to force serialisation of the transactions here. As a result we can't compose with a transaction outside of the lock.

This means that some operations or domains can't be composed with larger transactions. It has yet to be seen whether this will cause an issue; more testing is required to confirm any problem. I suppose this means we can't mix pessimistic and optimistic transactions. So far it seems it will be a problem for the following domains.

  1. Vaults domain - Lifecycle and editing of vaults relies heavily on locks.
  2. NotificationsManager - Locking is needed for the counter and preventing other conflicts.

Note that this has nothing to do with IPC locking as in #290.

Additional Context

Tasks

  1. Update any transact wrappers by removing them or using withF/withG locking directly.
  2. In all cases where there is a conflict, just throw it up the stack. We expect to handle conflicts within the handlers, or look deeper into it later.
  3. The ErrorDBTransactionConflict error should never be seen by the user. We should catch it and override it with a more descriptive error for the context.
  4. Transactions should be started within the handlers and passed all the way down to where they are needed. The idea is to make each handler atomic.
  5. Concurrency testing should be introduced, but only after SI transactions have been implemented.
  6. All usage of DB should be updated to use the new API. This means removing sublevels and utilising LevelPath and KeyPath instead.
  7. All usage of db streams should be replaced with the db iterator.
  8. All instances of db.put, db.get and db.del should use transactions via tran.put/get/del.
CMCDragonkai commented 2 years ago

The development of the RWLock system here https://gist.github.com/CMCDragonkai/4de5c1526fc58dac259e321db8cf5331 may be usable by a number of domains to increase their concurrency.

CMCDragonkai commented 2 years ago

The RWLock system has been improved with the ability to do read-preferring or write-preferring.

The async-init has been upgraded to use locking for its start, stop and destroy methods. This makes it a lot better now; they are using a write-preferring RWLock.

There was some discussion about locking abstractions, namely the usage of a generic with: since a lot of domains are using a sort of transact or withTransaction method, this can be made into a generic withF or withG utility function that also avoids the need to nest a bunch of transact callbacks.

I'm just trying to find where I wrote that.

CMCDragonkai commented 2 years ago

The commentary about a general with is here: https://github.com/MatrixAI/js-polykey/pull/310#issuecomment-1001820924

It came out of discussion about general resource management which includes locking.

CMCDragonkai commented 2 years ago

Attempting to implement the withF combinator. One of the problems is avoiding lots of overloaded type signatures. I originally thought I'd have to implement it similar to Promise.all, but it turns out tuple types are spreadable in TS 4.0.

However I also need to map and index into the tuple type before spreading. This is a bit more complicated; I had to ask an SO question about this: https://stackoverflow.com/questions/70736753/how-to-map-index-into-tuple-types-that-is-generically-spread-in-typescript

CMCDragonkai commented 2 years ago

Initial attempt; however the types are not correct, since the f callback still takes a spread of the resources. We need a magic type constructor to convert Resources to the second element of the return type of each resource.

type Resource<T = void> = () => Promise<[
  release: () => Promise<void>,
  resource: T
]>;

async function withF<
  Resources extends Array<Resource<unknown>>,
  Result
>(
  resources: Resources,
  f: (...resources: [...Resources]) => Promise<Result>
): Promise<Result> {
  const releases = [];
  const items = [];
  try {
    for (const resource of resources) {
      const [release, item] = await resource();
      releases.push(release);
      items.push(item);
    }
    return await f(...items);
  } finally {
    releases.reverse();
    for (const release of releases) {
      await release();
    }
  }
}
CMCDragonkai commented 2 years ago

It appears one of the problems is that mapped types don't seem to work correctly for tuple types.

For example:

type Api = {
  createItem: (name: string) => Promise<any>;
  getItem: (id: number) => Promise<any>;
  updateItem: (item: any) => Promise<any>;
  deleteItem: (id: number) => Promise<void>;
};

// type Api = [() => number, () => string];

type NewApi = {
  [K in keyof Api]: ReturnType<Api[K]>
};

The above type checks; however if you switch to using the second Api, where it's a tuple of functions, then it doesn't type check. I thought mapped types already worked fine with tuples, but this does not appear to be the case. And if it's not possible to map into a tuple type like this, then we can't really create the signature we want for withF.

CMCDragonkai commented 2 years ago

Actually maybe the Resource can return a record type instead, and then we could actually map into it.

However we will also need to index into the Promise type to get rid of the promise wrapper.

CMCDragonkai commented 2 years ago

Based on this answer: https://stackoverflow.com/a/60713409/582917, there appears to be a way to do this.

We will need to dispense with ReturnType; it just doesn't work here. However ReturnType relies on conditional type expressions and the infer keyword, which we can use as well.

Here's an example:

type Call<R> = (...args: any[]) => R;

type FunctionReturns<T extends Record<number, Call<any>>> = {
  [K in keyof T] : T[K] extends Call<infer R> ? R: never
}

type FunctionReturns2<T extends readonly Call<any>[]> = {
  [K in keyof T] : T[K] extends Call<infer R> ? R: never
}

function getReturns<
  T extends (readonly [Call<any>] | readonly Call<any>[])
>(fs: T): FunctionReturns2<T> {
  // Escape hatch!
  return fs.map(f => f()) as any;
}

getReturns([() => 123, () => 'abc', () => [1,2] as const])

// force inference as a tuple, and not as an array
const fs = [() => 123, () => 'abc'] as const;

getReturns(fs);

The getReturns here can return either a tuple type or an array type, depending on the input type. This is what as const does: it ensures that TS infers the array as a tuple; by default TS assumes it is an array.

Notice I have 2 variants. FunctionReturns2 constrains T to be a tuple type.

It is essential to understand that tuples are "readonly arrays". They have to be written like readonly X[] or readonly [X].

CMCDragonkai commented 2 years ago

Note that the equivalent FunctionReturns using what ReturnType does looks like this:

type ReturnType<T extends (...args: any) => any> = T extends (...args: any) => infer R ? R : any

However it's better for us to have a separate Call type just so that we can refer to it in the getReturns.

CMCDragonkai commented 2 years ago

Some solutions are incoming...

type ResourceAcquire<T = void> = () => Promise<[() => Promise<void>, T?]>;

type Resources<T extends readonly ResourceAcquire<any>[]> = {
  [K in keyof T] : T[K] extends ResourceAcquire<infer R> ? R: never
}

async function withF<
  ResourceAcquires extends readonly [ResourceAcquire<unknown>] | readonly ResourceAcquire<unknown>[],
  Result
>(
  resourceAcquires: ResourceAcquires,
  f: (...resources: Resources<ResourceAcquires> & unknown[]) => Promise<Result>
): Promise<Result> {
  const releases: Array<() => Promise<void>> = [];
  const resources: Array<unknown> = [];
  try {
    for (const resourceAcquire of resourceAcquires) {
      const [release, resource] = await resourceAcquire();
      releases.push(release);
      resources.push(resource);
    }
    return await f(...resources as unknown as Resources<ResourceAcquires>);
  } finally {
    releases.reverse();
    for (const release of releases) {
      await release();
    }
  }
}

async function x() {
  let count: number = 0;
  const h: ResourceAcquire<number> = async () => {
    ++count;
    return [async () => { --count; }, count];
  };
  await withF(
    [
      h,
      async () => {
        return [async () => { }];
      }
    ],
    async (c, cs) => {
      console.log(c, cs);
      return c;
    }
  );
}

x();

One of the strange things is when we make the resource array constant: functions whose parameters don't have readonly applied will fail when given a readonly array. If we say the parameter is readonly, we're guaranteeing that we won't modify the array, so having a strictly readonly parameter is technically more flexible. A few more tweaks on the above and things should work.
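A minimal sketch of this flexibility point (the `total` function is a made-up example, not project code): a `readonly T[]` parameter accepts both mutable arrays and `as const` tuples, whereas a mutable `T[]` parameter would reject the readonly tuple at type-check time.

```typescript
// A `readonly number[]` parameter promises not to mutate its argument,
// so it accepts both mutable arrays and readonly (`as const`) tuples.
function total(xs: readonly number[]): number {
  return xs.reduce((acc, x) => acc + x, 0);
}

const mutable: number[] = [1, 2, 3];
const frozen = [1, 2, 3] as const; // type: readonly [1, 2, 3]

const a = total(mutable); // ok: mutable array is assignable to readonly array
const b = total(frozen);  // ok: readonly parameter accepts the readonly tuple
```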

CMCDragonkai commented 2 years ago

Slight extension:

type ResourceAcquire<T = void> = () => Promise<readonly [() => Promise<void>, T?]>;

type Resources<T extends readonly ResourceAcquire<any>[]> = {
  [K in keyof T] : T[K] extends ResourceAcquire<infer R> ? R: never
}

async function withF<
  ResourceAcquires extends readonly [ResourceAcquire<any>] | readonly ResourceAcquire<any>[],
  Result
>(
  resourceAcquires: ResourceAcquires,
  f: (...resources: Resources<ResourceAcquires> & any[]) => Promise<Result>
): Promise<Result> {
  const releases: Array<() => Promise<void>> = [];
  const resources: Array<any> = [];
  try {
    for (const resourceAcquire of resourceAcquires) {
      const [release, resource] = await resourceAcquire();
      releases.push(release);
      resources.push(resource);
    }
    return await f(...resources as unknown as Resources<ResourceAcquires>);
  } finally {
    releases.reverse();
    for (const release of releases) {
      await release();
    }
  }
}

async function x() {
  let count: number = 0;

  await withF(
    [
      async () => {
        return [async () => { }];
      }
    ] as const,
    async (c) => {
      console.log(c);
      return c;
    }
  );

  const h: ResourceAcquire<number> = async () => {
    ++count;
    return [async () => { --count; }, count];
  };

  await withF(
    [
      h,
      async () => {
        return [async () => { } ];
      }
    ] as const,
    async (c, cs) => {
      console.log(c, cs);
      return c;
    }
  );

  const arr = [
    h
  ] as const;

  await withF(arr, async (c) => {
    console.log(c);
  });

  const y = async () => {
    return [async () => {}] as const;
  }

  const arr2 = [
    y
  ] as const;

  await withF(arr2, async (c) => {
    console.log(c);
  });

}

x();

Notice the usage of as const; it is required if there's no explicit typing of the resources as a tuple.

CMCDragonkai commented 2 years ago

Changed to any instead of unknown:

type ResourceAcquire<T = void> = () => Promise<readonly [() => Promise<void>, T?]>;

type Resources<T extends readonly ResourceAcquire<any>[]> = {
  [K in keyof T] : T[K] extends ResourceAcquire<infer R> ? R: never
}

async function withF<
  ResourceAcquires extends readonly [ResourceAcquire<unknown>] | readonly ResourceAcquire<unknown>[],
  Result
>(
  resourceAcquires: ResourceAcquires,
  f: (...resources: Resources<ResourceAcquires> & any[]) => Promise<Result>
): Promise<Result> {
  const releases: Array<() => Promise<void>> = [];
  const resources: Array<unknown> = [];
  try {
    for (const resourceAcquire of resourceAcquires) {
      const [release, resource] = await resourceAcquire();
      releases.push(release);
      resources.push(resource);
    }
    return await f(...resources as unknown as Resources<ResourceAcquires>);
  } finally {
    releases.reverse();
    for (const release of releases) {
      await release();
    }
  }
}
CMCDragonkai commented 2 years ago

Attempting to do the withG variant, and discovered this interesting behaviour:

async function* g1 () {
  const x = yield 1;
  yield x;
  return 3;
}

async function* g2 () {
  try {
    return yield* g1();
  } finally {
    console.log('DONE!');
  }
}

async function main () {
  const g = g2();

  console.log('first', await g.next());
  console.log('second', await g.next(2));
  console.log('third', await g.next());

  for await (const v of g2()) {
    console.log(v);
  }

}

main();

The result is:

first { value: 1, done: false }
second { value: 2, done: false }
DONE!
third { value: 3, done: true }
1
undefined
DONE!

The finally block runs before the value is returned. This isn't really a huge issue, since by that time all asynchronous work has been done.

Furthermore, the return value is not processed by for await; it only looks at the values being yielded.

CMCDragonkai commented 2 years ago

The withF and withG are now integrated in #266. Example code follows:

```ts
type ResourceAcquire<Resource = void> = () => Promise<
  readonly [ResourceRelease, Resource?]
>;

type ResourceRelease = () => Promise<void>;

type Resources<T extends readonly ResourceAcquire<any>[]> = {
  [K in keyof T]: T[K] extends ResourceAcquire<infer R> ? R : never;
};

/**
 * Make sure to explicitly declare or cast `acquires` as a tuple using `[ResourceAcquire...]` or `as const`
 */
async function withF<
  ResourceAcquires extends
    | readonly [ResourceAcquire<any>]
    | readonly ResourceAcquire<any>[],
  T,
>(
  acquires: ResourceAcquires,
  f: (resources: Resources<ResourceAcquires>) => Promise<T>,
): Promise<T> {
  const releases: Array<ResourceRelease> = [];
  const resources: Array<any> = [];
  try {
    for (const acquire of acquires) {
      const [release, resource] = await acquire();
      releases.push(release);
      resources.push(resource);
    }
    return await f(resources as unknown as Resources<ResourceAcquires>);
  } finally {
    releases.reverse();
    for (const release of releases) {
      await release();
    }
  }
}

/**
 * Make sure to explicitly declare or cast `acquires` as a tuple using `[ResourceAcquire...]` or `as const`
 */
async function* withG<
  ResourceAcquires extends
    | readonly [ResourceAcquire<any>]
    | readonly ResourceAcquire<any>[],
  T = unknown,
  TReturn = any,
  TNext = unknown,
>(
  acquires: ResourceAcquires,
  g: (
    resources: Resources<ResourceAcquires>,
  ) => AsyncGenerator<T, TReturn, TNext>,
): AsyncGenerator<T, TReturn, TNext> {
  const releases: Array<ResourceRelease> = [];
  const resources: Array<any> = [];
  try {
    for (const acquire of acquires) {
      const [release, resource] = await acquire();
      releases.push(release);
      resources.push(resource);
    }
    return yield* g(resources as unknown as Resources<ResourceAcquires>);
  } finally {
    releases.reverse();
    for (const release of releases) {
      await release();
    }
  }
}

export { withF, withG };
export type { ResourceAcquire, ResourceRelease };
```

It's important to start planning the usage of this for other domains like nodes, and wherever there's a transaction callback, but the first one I'll be trying it out on is the Vaults domain.

CMCDragonkai commented 2 years ago

Partially resolved by these commits which have been cherry picked into master:

@tegefaulkes once you rebase on master, you can also start using the RWLock and withF and withG in #310.

CMCDragonkai commented 2 years ago

The node connections are now using withF and withG. The #266 will be introducing them into the vaults domain.

CMCDragonkai commented 2 years ago

While working on #326, I noticed that the KeyManager doesn't have any locking. This is because locking concepts were developed afterwards. It should have locking built into many of its methods like renewKeyPair. We don't want race conditions between renewKeyPair, resetKeyPair and resetRootCert.

CMCDragonkai commented 2 years ago

Some things to note:

  1. The NodeConnectionManager.acquireConnection does not need to be async.
  2. The withG isn't being used in NodeConnectionManager due to some issue with try/catch; this should be investigated.
  3. In NodeGraph I'm prototyping with this:

     protected lock: RWLock = new RWLock();
    
     public acquireLockRead(lazy: boolean = false): ResourceAcquire<RWLock> {
       return async () => {
         let release: ResourceRelease;
         if (lazy && this.lock.isLocked()) {
           release = async () => {};
         } else {
           const r = await this.lock.acquireRead();
           release = async () => r();
         }
         return [release, this.lock];
       };
     }
    
     public acquireLockWrite(lazy: boolean = false): ResourceAcquire<RWLock> {
       return async () => {
         let release: ResourceRelease;
         if (lazy && this.lock.isLocked()) {
           release = async () => {};
         } else {
           const r = await this.lock.acquireWrite();
           release = async () => r();
         }
         return [release, this.lock];
       };
     }
  4. Take note of the lazy boolean. This is because some of the public methods of the class may be called by other public methods of the same class, and we want to avoid the class deadlocking itself. Neither async-mutex nor RWLock is a re-entrant mutex: https://en.wikipedia.org/wiki/Reentrant_mutex. Right now "ownership tracking" is only done directly with the lock being a protected property; however it's possible to acquire this lock under special cases.
  5. This may mean we replace _transaction with withF([this.acquireLockWrite(true)], ...) all the time. If that is the case, we may want lazy to be true by default...
  6. In RWLock I've added 2 additional methods, isLockedReader and isLockedWriter; perhaps these should be used by the lazy acquisition?

Further prototyping is required to realise the application of this to all domains that are using _transaction, especially ACL and GestaltGraph.
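The intent of the lazy flag can be sketched with a minimal non-reentrant mutex (a stand-in for RWLock's write side; the names here are illustrative, not the real implementation). An outer method takes the lock normally, and a nested call passes lazy so it rides on the already-held lock instead of deadlocking:

```typescript
type LockRelease = () => void;

// Minimal non-reentrant mutex standing in for RWLock's write side.
class SimpleLock {
  protected locked = false;
  protected waiters: Array<() => void> = [];

  public isLocked(): boolean {
    return this.locked;
  }

  public async acquire(): Promise<LockRelease> {
    while (this.locked) {
      await new Promise<void>((resolve) => this.waiters.push(resolve));
    }
    this.locked = true;
    return () => {
      this.locked = false;
      const next = this.waiters.shift();
      if (next != null) next();
    };
  }

  // Lazy acquisition: if the lock is already held, assume our caller holds it
  // and return a no-op release. Caveat: this is only sound for nested calls;
  // two independent racing calls would skip the lock incorrectly.
  public async acquireLazy(lazy: boolean): Promise<LockRelease> {
    if (lazy && this.locked) return () => {};
    return this.acquire();
  }
}

const lock = new SimpleLock();

async function inner(): Promise<string> {
  const release = await lock.acquireLazy(true); // rides on the outer lock
  try {
    return 'inner ran';
  } finally {
    release();
  }
}

async function outer(): Promise<string> {
  const release = await lock.acquire();
  try {
    return await inner(); // would deadlock without the lazy flag
  } finally {
    release();
  }
}
```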

CMCDragonkai commented 2 years ago

Right now the _transaction method is actually buggy, because it checks whether the lock is already locked. This means that if a class wanted to expose 2 public methods which require mutual exclusion, using _transaction won't work: if the lock is already locked, the second invocation just passes through.

So its usage was meant to allow re-entrancy, but it's not able to distinguish these 2 independent calls.

Consider this difference:

class A {
  // Both `pubA` and `pubB` are locking
  pubA () { this.pubB() }
  pubB () { ... }
}

If the call stack was:

A.pubA() -> A.pubB()

We would expect pubB to be re-entrant, so it doesn't cause a deadlock.

However if the call stack was instead:

A.pubA()
A.pubB()

Where they are called simultaneously and concurrently, we want pubA and pubB to block each other here.

So _transaction isn't enough, and neither is the lazy boolean. We need a more automatic way to track the ownership and call stack.

CMCDragonkai commented 2 years ago

The quick solution is to extract the common functionality out of pubA and pubB into a protected method which does no locking. This way pubA calls the protected method, and both public methods can block each other as normal. This means re-entrancy isn't required because it's built into the structure of the code. It just means we have to be careful not to have public methods call other public methods when those public methods are state-mutating procedures.
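The restructure can be sketched as follows (a toy promise-chain mutex stands in for the real lock; `Domain`, `opA`, `opB` are hypothetical names): public methods are the locking entry points and never call each other, while shared logic lives in lock-free protected methods.

```typescript
class Domain {
  protected chain: Promise<void> = Promise.resolve(); // toy mutex
  public log: Array<string> = [];

  protected withLock<T>(f: () => Promise<T>): Promise<T> {
    const result = this.chain.then(f);
    this.chain = result.then(() => undefined, () => undefined);
    return result;
  }

  public pubA(): Promise<void> {
    return this.withLock(async () => {
      this.opA();
      this.opB(); // reuse without re-entrancy: call the protected method
    });
  }

  public pubB(): Promise<void> {
    return this.withLock(async () => this.opB());
  }

  // State-mutating logic: never locked, never public.
  protected opA(): void { this.log.push('A'); }
  protected opB(): void { this.log.push('B'); }
}
```

Concurrent pubA() and pubB() now block each other as desired, while pubA still reuses pubB's logic through opB without deadlocking.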

CMCDragonkai commented 2 years ago

It seems the DBTransaction is the only way to maintain context in a call stack/graph.

Basically methods of INodeManager take a tran: DBTransaction as the first parameter. This allows one to build up a transactional context for methods to run in.

Then if pubA calls pubB, they pass the same transactional context. In INodeManager this was made the first parameter, since it is always required. Unlike INodeManager, if we were doing this we would need to make it an optional parameter at the very end, so that normal calls to pubA would still work.

This context is different from the instance context, since it only exists during the call stack.

That would be the only way to track the lock ownership on a call-basis.

If we were to do this for RWLock, then it could be done.

However NodeGraph also uses the DB. So for such a lock, it would make sense to tie our lock context to the DB locking as well.

It is possible to create an acquisition function for the DB transaction:

  public acquireTransaction: ResourceAcquire<DBTransaction> = async () => {
    const tran = new Transaction({ db: this.db });
    return [
      async (e?: Error) => {
        if (e == null) {
          try {
            await tran.commit();
          } catch (e) {
            await tran.rollback();
            throw e;
          }
          await tran.finalize();
        } else {
          await tran.rollback();
        }
      },
      tran
    ];
  };

To do so, a little change must be done on withF and withG:

type ResourceRelease = (e: Error | undefined) => Promise<void>;

It's backwards compatible, but basically means the resource releaser will now have a reference to the error in case it was thrown during the acquisition or the handler call.
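A minimal sketch of that change, with the tuple typing elided for brevity (this simplifies the withF signatures from the earlier comments; the shapes are assumptions for illustration): the thrown error is captured and handed to each releaser, so a transaction resource can commit on success and rollback on failure.

```typescript
type ResourceRelease = (e?: Error) => Promise<void>;
type ResourceAcquire<T = void> = () => Promise<readonly [ResourceRelease, T?]>;

// Simplified withF: releasers now receive the error thrown (if any) during
// acquisition or the handler call.
async function withF<T>(
  acquires: readonly ResourceAcquire<any>[],
  f: (resources: any[]) => Promise<T>,
): Promise<T> {
  const releases: Array<ResourceRelease> = [];
  const resources: Array<any> = [];
  let error: Error | undefined;
  try {
    for (const acquire of acquires) {
      const [release, resource] = await acquire();
      releases.push(release);
      resources.push(resource);
    }
    return await f(resources);
  } catch (e) {
    error = e as Error;
    throw e;
  } finally {
    releases.reverse();
    for (const release of releases) {
      await release(error); // undefined => commit path, Error => rollback path
    }
  }
}
```

A releaser built like acquireTransaction above can then branch on the error argument to decide between commit and rollback.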

CMCDragonkai commented 2 years ago

Applying the concept of withF and withG to DBTransaction means that different domains should share a transaction context in order to maintain atomicity.

Consider a "transaction" that takes place between ACL and NodeGraph. An example usage might look like this:

withF([
  acquireTransaction()
], async ([tran]) => {
  await nodeGraph.doSomething(tran);
  await acl.doSomething(tran);
});

The result is that any mutation here is atomic between nodeGraph and acl. Compare this to GestaltGraph, where the GG just locks both itself and the ACL to perform mutation. There's no atomicity, so it's possible for mutations to the ACL to persist while mutations to the GG are lost.

At the same time it's a sledgehammer because it locks the entire GG and ACL together. There's no fine-grained concurrency control here.

The usage of the transaction allows one to be more fine-grained, since the interactions operate entirely on the transaction context. Note that transactions are somewhat limited with respect to iteration, but that may be resolved by https://github.com/MatrixAI/js-db/issues/4.

It is the combination of smaller, more sophisticated locks like RWLock, along with the transaction snapshot system, that enables fine-grained concurrency control.

CMCDragonkai commented 2 years ago

What if nodeGraph.doSomething wants to interact with ACL? Like in GestaltGraph. What would this look like?

We want to use withF or withG everywhere where this pattern might take place, so it's turtles all the way down.

Methods should take a transaction parameter, most likely optionally, so that the method can still work within its own transaction if it doesn't need to share one with any higher-level methods.

class NodeGraph {
  public async doSomething(tran?: DBTransaction): Promise<void> {
    if (tran == null) {
      return await withF([
        acquireTransaction()
      ], async ([tran]) => {
        return this.doSomething(tran);
      });
    }
    await this.acl.doSomething(tran);
    // continue with mutations
  }
}
CMCDragonkai commented 2 years ago

The acquireTransaction function mentioned above cannot be the current DB.transact method. It's not in the resource-acquisition signature style required by withF and withG, and it also assumes a default lock on the DB itself, which would not be fine-grained.

Furthermore acquireTransaction shouldn't bundle locks with it like transact does. Instead we are going to factor out the locking, like it is done for INodeManager doing locks on individual inodes. This is sort of like "unbundling".

Again this becomes easier, as methods can now decide what to lock when performing the transaction.

await withF([lock1, lock2, acquireTransaction], async ([, , tran]) => { ... });

The DBTransaction by itself has no locking applied. Without locks, it is possible that some transactions conflict with each other, which would affect the desired isolation level. Our desired isolation level is usually read-committed.

  1. Dirty reads are not possible if all threads are using transactions; you cannot read uncommitted transactions regardless of locking.
  2. Non-repeatable reads are possible because we aren't using an iterator snapshot, and we are not saving what we have read in the in-memory snapshot; so if another thread's transaction commits, our transaction may see it, and our 2nd read may differ from the first.
  3. Phantom reads are possible for the same reason that non-repeatable reads are possible.

So when should locks be applied at all? When transactions need mutual exclusion against each other. By default they don't actually need any mutual exclusion, since as long as everybody uses transactions, read committed is fulfilled.

When do transactions need mutual exclusion? Usually when you are doing a "write after read" or "read after write": basically anything involving a write (not just singular writes), or when a bunch of writes must be done atomically. This is to prevent inconsistency when a read and a write occur together.

CMCDragonkai commented 2 years ago

Now, to ensure that we don't have nested lock acquisition, we often used _transaction before to skip locking if already in a locked context. That's another one of these problems: the re-entrancy problem discussed above.

Now we discussed 3 possible solutions:

  1. Build in the re-entrancy into the code structure by carefully only using protected methods to call each other, and public methods are entry points. https://github.com/MatrixAI/js-polykey/issues/294#issuecomment-1046484765 It means public methods cannot call other public methods if those methods are doing mutation.
  2. Pass the locks like we pass the tran down the call graph; this allows downstream methods to know that a certain lock has already been acquired.
  3. Designate a layer whose members cannot call each other. This is similar to solution 1 and is often used in web apps: controllers in the MVC model cannot call each other, so they are the "bootstrap" point for any concurrency control. In solution 1, that would make all public methods the controller layer. However this creates a lot of overhead, and we lose the flexibility to have our class domains call other class domains. And because we decided that our GRPC handlers are thin wrappers around our domains and our domains are the actual controllers, we cannot really do this.

It seems solution 2 is the most flexible; it just means that if locking needs to be done for a given operation, any higher operation using that lower operation must also be able to pass that lock around, and both transactions and associated locks must be shared.

To reduce overhead, we can tie up locking and transactions together into a single call:

acquireTransaction([lock1, lock2])

So that the passed transaction can also have its associated locks, and then we trust that callers have passed the right locks in. This may require a more powerful type like:

DBTransaction<[Lock1, Lock2]>

Not sure atm, so let's see.

CMCDragonkai commented 2 years ago

Continuing the idea of dynamic fine-grained locks.

It makes sense that some operations will require locks that range over a particular set of keys when using the DB.

These locks would then be combined with the transaction to create higher levels of isolation than what is provided by default by the DB transaction.

To do this, we introduce a Locks class that is a generic dynamic read-write lock collection. It primarily uses RWLockWriter so that we get some flexibility with our locks; pretty much all our locking use cases are satisfied by it, except where IPC locking is concerned.

class Locks {

  protected _locks: Map<string, RWLockWriter> = new Map();

  get locks(): ReadonlyMap<string, RWLockWriter> {
    return this._locks;
  }

  public lockRead(id: string): ResourceAcquire<RWLockWriter> {
    return async () => {
      let lock = this._locks.get(id);
      if (lock == null) {
        lock = new RWLockWriter();
        this._locks.set(id, lock);
      }
      const [lockRelease] = await lock.acquireRead();
      return [
        async () => {
          await lockRelease();
          if (!lock!.isLocked()) {
            this._locks.delete(id);
          }
        },
        lock
      ];
    };
  }

  public lockWrite(id: string): ResourceAcquire<RWLockWriter> {
    return async () => {
      let lock = this._locks.get(id);
      if (lock == null) {
        lock = new RWLockWriter();
        this._locks.set(id, lock);
      }
      const [lockRelease] = await lock.acquireWrite();
      return [
        async () => {
          await lockRelease();
          if (!lock!.isLocked()) {
            this._locks.delete(id);
          }
        },
        lock
      ];
    };
  }

}

Now in our operations like sameNodePerm:

  @ready(new aclErrors.ErrorACLNotRunning())
  public async sameNodePerm(
    nodeId1: NodeId,
    nodeId2: NodeId,
    tran?: DBTransaction,
  ): Promise<boolean> {
    const nodeId1Path = [...this.aclNodesDbPath, nodeId1.toBuffer()] as unknown as KeyPath;
    const nodeId2Path = [...this.aclNodesDbPath, nodeId2.toBuffer()] as unknown as KeyPath;
    if (tran == null) {
      return withF(
        [
          this.db.transaction(),
          this.locks.lockRead(
            dbUtils.keyPathToKey(nodeId1Path).toString()
          ),
          this.locks.lockRead(
            dbUtils.keyPathToKey(nodeId2Path).toString()
          ),
        ],
        async ([tran]) => this.sameNodePerm(nodeId1, nodeId2, tran)
      );
    }
    const permId1 = await tran.get(nodeId1Path, true);
    const permId2 = await tran.get(nodeId2Path, true);
    if (permId1 != null && permId2 != null) {
      return IdInternal.fromBuffer(permId1).equals(IdInternal.fromBuffer(permId2));
    }
    return false;
  }

As you can see, when we create the transaction here, we ask to acquire read locks for both nodeId1Path and nodeId2Path, which are derived from the KeyPaths of the 2 keys being read.

However there's a problem here, and it's that it is possible to get into a deadlock.

If you were to do:

await Promise.all([
  sameNodePerm(nodeId1, nodeId2),
  sameNodePerm(nodeId2, nodeId1)
]);

This can result in a deadlock because the first op acquires the lock for nodeId1 while the second op acquires the lock for nodeId2, and then each is left waiting on the lock held by the other.

There are 2 solutions here:

  1. Enforce a lock hierarchy, by always sorting the requests for locks by id: string, this would ensure that both operations would request locks in the same order.
  2. Retry locking requests with jitter delay upon a deadlock timeout.

Solution 2 involves updating js-async-locks and integrating withTimeout and tryAcquire decorators. https://github.com/DirtyHairy/async-mutex#limiting-the-time-waiting-for-a-mutex-or-semaphore-to-become-available. We would then need to detect when we are in deadlock, by retrying the entire locking sequence.

Solution 1 could be done by changing the resource acquisitions to:

      return withF(
        [
          this.db.transaction(),
          this.locks.lockRead(
            dbUtils.keyPathToKey(nodeId1Path).toString(),
            dbUtils.keyPathToKey(nodeId2Path).toString()
          ),
        ],
        async ([tran]) => this.sameNodePerm(nodeId1, nodeId2, tran)
      );

This would allow the lockRead and lockWrite methods to reorder the locking sequence by sorting the strings. The user can still specify an explicit order by using 2 or more separate this.locks.lockRead calls; when explicit lock order is not needed, all the keys are simply passed as parameters to the same call.

Note that lockRead and lockWrite would then be variadic functions, taking as many id: string as they need, and resorting them before attempting to lock.
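As a rough illustration of that variadic, sorted acquisition (this is a hypothetical sketch, not the js-async-locks API; KeyedLocks and lockAll are made-up names):

```typescript
// Hypothetical sketch: a keyed mutex collection whose variadic lockAll
// sorts (and deduplicates) ids so that all callers acquire locks in the
// same order, preventing ABBA-style deadlocks.
class KeyedLocks {
  protected tails: Map<string, Promise<void>> = new Map();

  // Acquire a single key's mutex; resolves to a release function
  protected async lock(id: string): Promise<() => void> {
    const prev = this.tails.get(id) ?? Promise.resolve();
    let release!: () => void;
    const next = new Promise<void>((resolve) => { release = resolve; });
    this.tails.set(id, prev.then(() => next));
    await prev; // wait for the previous holder to release
    return release;
  }

  // Variadic acquisition with an enforced lock hierarchy
  public async lockAll(...ids: string[]): Promise<() => void> {
    const sorted = [...new Set(ids)].sort();
    const releases: Array<() => void> = [];
    for (const id of sorted) {
      releases.push(await this.lock(id));
    }
    // Release in reverse acquisition order
    return () => { for (const r of releases.reverse()) r(); };
  }
}
```

With this shape, sameNodePerm's two reads could be guarded by a single lockAll(key1, key2) call, and the ABBA interleaving from the Promise.all example above can no longer deadlock.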

CMCDragonkai commented 2 years ago

Created https://github.com/MatrixAI/js-async-locks/issues/2 to integrate timeouts to our locks in case we later want to retry.

But for now I think re-ordering to set a lock hierarchy is a better idea.

One thing though, if another function that has an existing transaction context is calling sameNodePerm, it may have to acquire those locks even if it is just passing the tran in.

Or it may not; it depends on the isolation level required. The only reason to lock here for sameNodePerm is that some other operation may write to node 1 without having finished writing to node 2. But full consistency is not that important unless sameNodePerm must then be consistent with all other DB methods.

This is somewhat different from having locks built into a transaction through SQL queries like SELECT ... FOR UPDATE.

CMCDragonkai commented 2 years ago

However even after adding in sorting, it is still possible to have deadlocks, so we have to be careful coding up the transactions, and eventually build up a deadlock detection system, and integrate the retry system to catch them. Any detected deadlocks should eventually be fixed in the code though.

CMCDragonkai commented 2 years ago

Some notes on the usage of with* convenience wrappers around the withF and withG utility functions: https://github.com/MatrixAI/js-encryptedfs/pull/63#issuecomment-1092630091

This led to js-async-locks to become 2.1.1. Not a big deal for the packages that are still using 2.1.0 though.

CMCDragonkai commented 2 years ago

Also the passing of the tran object around is quite flexible, but kind of verbose; it would be nice if we had a monadic context that operations can run within. The only issue is that it would require an object like iNodeMgrTran.dirCreate(), and it cannot extend to a transaction spanning other domains. It only works because ACL and the other domains use DBTransaction together. A shared monadic context would need to combine a transactional "proxy" object that allows calls to all the domain's regular methods.

See https://www.measurethat.net/Benchmarks/Show/6274/4/access-to-proxy-vs-object, most important would be to avoid nesting proxies as per https://gomakethings.com/better-proxy-performance-in-vanilla-js/. Accessing proxy vs accessing object isn't that bad. It's just an extra pointer dereference.

I could imagine something like:

// Each domain uses their own custom wrapper withTransactionF that creates a transaction object with associated locking properties
this.withTransactionF(async (tran) => {
  // tran is DBTransaction shared between domains
  // associated locks is setup ahead of time with withTransactionF
  // other resources required would be designed ahead of time
  this.doSomething(tran);
  acl.doSomething(tran);
  gg.doSomething(tran);
});

This would be a "proxy" based API:

withTransactionF(this, acl, gg, async (thisProxy, aclProxy, ggProxy) => {

  // proxy
  thisProxy.doSomething();

  // can't do this
  // these have to be part of the proxy
  // acl.doSomething();
  // gg.doSomething();

  aclProxy.doSomething();
  ggProxy.doSomething();

});
CMCDragonkai commented 2 years ago

The proxy is just used so that it will maintain the transaction context across calls to the underlying domain.

The callback is still needed to know when a context needs to be started and ended. It naturally "scopes" the context. The proxy objects mean that you don't need to do explicit passing of tran around.

This possibly avoids having to have a tran parameter on every method, but instead the tran property can be referenced via this instead.

The proxy can actually be a "decorator" of sorts. But it's important to maintain type-safety, so that the proxy can still be used whenever some function requires the same domain type.

The creation of the proxy is therefore the same as any other resource, and fits in the resource API.

withF([acl.transactionProxy(), gg.transactionProxy()], async ([aclProxy, ggProxy]) => {
  // now use the proxies
});

This abstracts above the transaction() calls which return a DBTransaction. But now instead creates a proxy of an object that contains the transaction context.

One can abstract DBTransaction to a generic "Software Transaction", but ultimately based on some sort of state. It can take a "transaction" interface that anything that has state can provide. Of course one has to fix the isolation level of that transaction here.

But you'd need to figure out how to incorporate locks as well in their relevant domains too.

Just a crazy idea, but the benefit is just to avoid having to pass the tran around.
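A minimal sketch of the proxy mechanics (hypothetical names; Tran stands in for DBTransaction and withTranProxy is not a real API): every method access on the proxy returns a wrapper that appends the shared transaction as the final argument.

```typescript
type Tran = { id: number }; // stand-in for DBTransaction

// Wrap a domain object so that its methods implicitly receive `tran` as
// their final argument, avoiding explicit tran passing at every call site.
// Calling via `target` directly avoids nested proxy dispatch overhead.
function withTranProxy<T extends object>(domain: T, tran: Tran): T {
  return new Proxy(domain, {
    get(target, prop, receiver) {
      const value = Reflect.get(target, prop, receiver);
      if (typeof value === 'function') {
        return (...args: unknown[]) => value.call(target, ...args, tran);
      }
      return value;
    },
  });
}
```

Type-safety is preserved at the call site since the proxy has type T, though the appended tran argument is invisible to the type system; that is part of the "typescript magic" still to be figured out.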

For future investigation.

CMCDragonkai commented 2 years ago

I feel like there's a way to extend the ES6 promise for one situation (or better a decorator around a promise, so that subsequent promises remain the same), and this would enable you to do something like:

async method(tran?) {
  tran = makeTran(tran, ino);
  await doAsyncStuff();
  doSyncStuff();
}

And after await method() finishes, at the very end the tran is committed. It would require makeTran to return an augmented promise, one that takes the then method and adds something to be completed at the end using finally.

Unfortunately I'm not sure how to do it. I tried adding a finally into the augmented promise, but finally activates as soon as the promise resolves, regardless of thens. You need to intercept at a specific then, but you don't know which one it is because you don't know how many thens there may be.

tegefaulkes commented 2 years ago

Can't you already await a promise multiple times to get the same result? wouldn't this be like caching the returned promise rather than augmenting the promise itself?

CMCDragonkai commented 2 years ago

Not sure what you mean, but I've abandoned that approach. I'm moving ahead with how it is done right now in EFS. Just need to abstract the locks collection structure from both DB and PK into async-locks with https://github.com/MatrixAI/js-async-locks/issues/5. I realised I forgot to filter out duplicate lock keys, which may cause deadlocks in a single call like locks.lock(1, 1). But that's going to be solved in https://github.com/MatrixAI/js-async-locks/pull/6

CMCDragonkai commented 2 years ago

During our meeting, we discussed that there can be a slight overhead with using the DB transaction with locks.

Consider ACL.sameNodePerm (https://github.com/MatrixAI/js-polykey/issues/294#issuecomment-1084225725). Suppose it takes tran?: DBTransaction as the last parameter.

Now technically this doesn't require a lock, since a check can be made through 2 reads, even if those reads may be inconsistent. But this is a good example to illustrate the problem.

Now if you don't pass a tran, the method knows what keys to lock when starting the transaction. Namely the 2 node ids need read lock so they can be checked.

But if you do pass tran, then the system won't attempt to lock those keys.

So the caller of sameNodePerm will need to know what it needs to lock before it calls ACL.sameNodePerm if it is intending to pass the tran. This knowledge has to be bubbled up by the programmer to the very beginning of wherever that transaction context starts. Which could be several layers up.

It's going to be quite complex to always keep track of what keys need to be locked.

In SQL, this problem doesn't exist in this way, because locks are within the transaction context, not outside.

Then if one were to call sameNodePerm within a SQL transaction, one would just lock those as part of a SELECT ... FOR SHARE. Note that SQL normally isn't even composable, so these problems tend to require the programmer to structure things differently, and not through just function calls but through query builders.

Now this would mean that you are attempting to acquire a lock while within the transaction. Is that necessarily a bad thing?

I reckon sameNodePerm could try to acquire those locks even if the outside were to pass in a transaction.

However right now, nested acquisitions would result in a deadlock.

await withF([this.lock('A')], async () => {
  await withF([this.lock('A')], async () => {
    // DEADLOCK
  });
});

I believe this problem was mentioned already in https://github.com/MatrixAI/js-polykey/issues/294#issuecomment-1068719312, and the solution proposed was solution 2, where locks are also tracked as they are passed down the call graph with the tran. While https://github.com/MatrixAI/js-polykey/issues/294#issuecomment-1068713673 tells us when mutual exclusion would actually be necessary and when you can just use transaction by itself without further locking.

So there are some further things to do:

  1. Bring locks back into DBTransaction or at least provide an augmented object that combines the LockBox with the DBTransaction into a composed resource, such that subsequent acquisition of certain locks within the same tran context will just "flow through", and won't result in a nested deadlock. Note that this will however bring back the possibility of deadlocks due to lack of enforced lock hierarchy, as the locking order is now determined by the call graph hierarchy. The only way around this is to design your code so that this doesn't happen, or to introduce deadlock tracking here with our new timeout mechanism. https://github.com/MatrixAI/js-polykey/issues/294#issuecomment-1084235910
    • Combined LockBox and DBTransaction concept to track nested deadlocks
    • Lock hierarchy across call graph cannot be enforced, so deadlocks can now occur due to mis-ordered lock hierarchy
    • Will require deadlock detection by using timeouts and retries
  2. The constant propagation of the tran parameter is kind of annoying, and future investigation into proxies & decorators could help (will require some typescript magic though): https://github.com/MatrixAI/js-polykey/issues/294#issuecomment-1094547431
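A toy sketch of point 1 (hypothetical LockBox/TranLocks names, not the js-db or js-async-locks APIs): a per-transaction lock set over a shared keyed mutex collection, where re-acquiring a key already held by the same transaction flows through instead of deadlocking.

```typescript
type Release = () => void;

// Shared keyed mutexes (promise-chain based), used by all transactions
class LockBox {
  protected tails: Map<string, Promise<void>> = new Map();
  public async acquire(id: string): Promise<Release> {
    const prev = this.tails.get(id) ?? Promise.resolve();
    let release!: Release;
    const next = new Promise<void>((resolve) => { release = resolve; });
    this.tails.set(id, prev.then(() => next));
    await prev;
    return release;
  }
}

// Per-transaction view: repeated acquisition of a held key is a no-op,
// so nested acquisitions within one transaction cannot self-deadlock
class TranLocks {
  protected held: Map<string, Release> = new Map();
  public constructor(protected box: LockBox) {}
  public async lock(id: string): Promise<void> {
    if (this.held.has(id)) return; // already held by this transaction
    this.held.set(id, await this.box.acquire(id));
  }
  public release(): void {
    for (const r of this.held.values()) r();
    this.held.clear();
  }
}
```

As noted above, this trades the self-deadlock away for the possibility of cross-transaction deadlocks from mis-ordered acquisition, which is what the timeout and detection work is for.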
CMCDragonkai commented 2 years ago

Note that the combined LockBox and DBTransaction is already in EncryptedFS, only the LockBox isn't really tracked along with the DBTransaction; it just isn't returned as a resource.

CMCDragonkai commented 2 years ago

Some further reading of:

  1. https://adit.io/posts/2013-05-15-Locks,-Actors,-And-STM-In-Pictures.html#software-transactional-memory
  2. https://rhishikesh.com/posts/mvcc_stm/
  3. https://convincedcoder.com/2018/09/01/Optimistic-pessimistic-locking-sql/

Tells me that our LevelDB transactions are quite similar to STM. In our case, the state is leveldb, and the transactions are all just software and in-memory in a single process.

One major difference is that the transactions in STM are meant to be "optimistic", and they are designed to take a snapshot of the state before the transaction starts, and then when they commit they check if their initial state is still the same before committing, and if it is not, the whole transactional operation is retried at the beginning but with the new value as the initial state. The user code which is provided as a function that takes the initial state is then re-ran, but it can decide to abort the transaction if the initial state is unsatisfactory.

This is a form of optimistic concurrency control.

Our DBTransaction does not do this. Firstly it does not keep track of the initial state of the database. This is related to snapshot isolation? (I'm not sure if the definition of snapshot isolation is different across different contexts.) We do have a snapshot, but our snapshot doesn't keep track of the initial state of the DB. Only if we were to use the tran.iterator() would this occur. We can expose this to the end user if they really want to do this. But comparing equality against the transactional snapshot provided by the iterator seems quite inefficient, as many STM examples focus on a single value that can be atomically checked and not the entire database.

And during commits, it does not check if the initial state or snapshot state is consistent with the underlying database.

There is a way to make our DBTransaction similar to STM transactions.

One major change is to only compare the values that have been touched. This means any get, put or del or iterator operation would "touch" keys in the underlying database. Their original value gets recorded with a hidden iterator() that is created at the beginning of the transaction. This may also mean that we store the read value into the snapshot and end up with repeatable-read isolation level. Then during a commit operation, we compare the touched values from our iterator with what's in the DB (to do this we have to create another iterator on the DB at the point of committing), and if there are any discrepancies, reject the commit.
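A toy in-memory model of that commit-time check (hypothetical OccTran/ConflictError names, not the js-db API): record the first-seen value of every touched key, and reject the commit if any of them has since changed underneath the transaction.

```typescript
class ConflictError extends Error {}

// Toy OCC transaction over an in-memory Map: reads "touch" keys, recording
// their first-seen value; commit rejects if any touched key changed in the
// underlying store between first read and commit.
class OccTran {
  protected readSet: Map<string, string | undefined> = new Map();
  protected writeSet: Map<string, string> = new Map();
  public constructor(protected db: Map<string, string>) {}

  public get(k: string): string | undefined {
    if (this.writeSet.has(k)) return this.writeSet.get(k); // read your writes
    if (!this.readSet.has(k)) this.readSet.set(k, this.db.get(k)); // touch
    return this.readSet.get(k); // repeatable read
  }

  public put(k: string, v: string): void {
    this.writeSet.set(k, v);
  }

  public commit(): void {
    for (const [k, v] of this.readSet) {
      if (this.db.get(k) !== v) throw new ConflictError(`conflict on ${k}`);
    }
    for (const [k, v] of this.writeSet) this.db.set(k, v);
  }
}
```

Note that serving repeated gets from the recorded read set is exactly the repeatable-read behaviour mentioned above.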

The problem is that the DBTransaction doesn't know what produced the transaction, so it doesn't know what to "re-run". At this point it could throw an exception and expect the user of the transaction to deal with it like:

while (true) {
  const [tranRelease, tran] = await tranAcquire();
  await tran.put('x', 'y');
  try {
    await tranRelease();
  } catch (e) {
    if (e instanceof DBCommitWriteConflict) {
      // we have to restart the transaction entirely, because `x` was a conflict
      continue;
    }
    throw e;
  }
  break;
}

Which means convenience methods would have to be used to deal with this situation in a nicer way, and they can just rerun the callback that is passed in.
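Such a convenience wrapper could look like this (hypothetical names and shapes; ErrorConflict stands in for DBCommitWriteConflict, and the acquire/release pair mirrors the snippet above):

```typescript
// Hypothetical conflict error and transaction shapes; not the js-db API
class ErrorConflict extends Error {}

type Tran = { put(k: string, v: string): Promise<void> };
type TranAcquire = () => Promise<[() => Promise<void>, Tran]>;

// Rerun the whole transaction callback whenever commit reports a conflict
async function withRetryF<T>(
  tranAcquire: TranAcquire,
  f: (tran: Tran) => Promise<T>,
): Promise<T> {
  while (true) {
    const [tranRelease, tran] = await tranAcquire();
    const result = await f(tran);
    try {
      await tranRelease(); // commit
      return result;
    } catch (e) {
      if (e instanceof ErrorConflict) continue; // restart from the top
      throw e;
    }
  }
}
```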

Seems like we have quite a lot of flexibility to manipulate our DBTransaction, but depending on the way we do it, it will impact how we organise the code that uses it. It's a big cross-cutting concern.

CMCDragonkai commented 2 years ago

Further reading here https://www.sobyte.net/post/2022-01/rocksdb-tx/ shows that rocksdb actually supports both optimistic transactions and pessimistic transactions.

The optimistic transaction follows what I wrote above. It also just throws an exception and expects the user to figure out what they want to do and that could mean a retry, or a loop goto with continue. But the DB doesn't control how this is to be done.

In the pessimistic case, it bundles a lock manager into the transaction, and this lock manager also detects deadlocks, since by allowing locks to be taken within the transaction, deadlocks can occur. It also addresses re-entrant locking by allowing a lock to be acquired if it is already held by the same transaction. But for locks that are deadlocked across transactions due to different lock orderings, the only solution is a timeout mechanism.

Once timed out, it can also do some diagnostics to find out why the transaction failed. In this case, you can track what locks you were waiting on, and these point to particular transactions with transaction IDs. You can then check what those transactions were waiting for, and if they were waiting for a lock held by your transaction ID, then you have a circular deadlock.
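That diagnostic amounts to finding a cycle in a wait-for graph. A minimal sketch (illustrative data structures only):

```typescript
// Map each waiting transaction id to the transaction id it waits on
// (i.e. the current holder of the lock it wants). A cycle means deadlock.
function hasDeadlock(waitsFor: Map<string, string>, start: string): boolean {
  const seen = new Set<string>();
  let current: string | undefined = start;
  while (current !== undefined) {
    if (seen.has(current)) return true; // circular wait detected
    seen.add(current);
    current = waitsFor.get(current);
  }
  return false; // chain terminates at a transaction that isn't waiting
}
```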

CMCDragonkai commented 2 years ago

This could mean that task 1 in https://github.com/MatrixAI/js-polykey/issues/294#issuecomment-1106042197 is just about bringing together a proper pessimistic DB transaction.

However in many places PK can probably do well with just an optimistic DB transaction. More prototyping is needed here.

CMCDragonkai commented 2 years ago

The js-resources package has been updated so that ResourceAcquire functions will take an array of already allocated resources. This enables withF and withG to pass allocated resources in order to the resource acquisition functions. Basically this means resource acquisition may be parameterised based on a prior resource acquisition.

https://github.com/MatrixAI/js-resources/pull/1

Note that I tried to do it a bit more elegantly using a POJO to form a resource DAG (https://github.com/microsoft/p-graph), but this was difficult to do, didn't allow users to enforce a strict order, and may require more machinery like memoization. So for now this is usable.

However I couldn't get the types correct so explicit types are needed for the resource types.

An example of this:

    withF([
      iNodeMgr.inoAllocation(),
      ([rootIno]: [INodeIndex]) => iNodeMgr!.transaction(rootIno)()
    ], async ([rootIno, tran]) => {
      // ... use rootIno and tran
    });

The reason this was done is because iNodeMgr.transaction() returns the ResourceAcquire<DBTransaction>. But the acquisition of DBTransaction itself did not take parameters, as this was done by transaction().

CMCDragonkai commented 2 years ago

I'm going to need to implement pessimistic transaction support in DBTransaction as well as a deadlock detector, to ensure that we don't have the unscalable approach of expecting every call to know what locks it needs before calling other domains' methods. Doing so should allow most of our domain methods to be composable. The alternative is optimistic transactions, which should function similarly to software transactional memory and enable even more composability.

CMCDragonkai commented 2 years ago

So most of the updates to the PK codebase should be doable, but the final DBTransaction update would require refactoring the locking done before the transaction to be done within the transaction.

As for deadlock detection, we would consider that a programmer error when discovered, so when deadlock is discovered, an exception is thrown as normal. Users can retry of course. The PK application should not crash in this instance however.

CMCDragonkai commented 2 years ago

This will be fixed when merging #366 before #326 rebases on top.

CMCDragonkai commented 2 years ago

Pessimistic shouldn't be needed. Working on optimistic instead. It's a lot more flexible.

CMCDragonkai commented 2 years ago

Snapshot Isolation based OCC transactions should be backwards compatible with the existing usage of DBTransaction in EFS.

The main difference is the possibility of ErrorDBTransactionConflict exception, and the need to handle write-skews.

In EFS, its API demands PCC behaviour, so it will continue to use locks even after updating to the new SI transactions; an OCC transaction can be converted to a PCC transaction as long as locks are used, so the SI implementation will be backwards compatible. Its usage of locking should also mean that write-skews cannot occur.

However if upgrading to the SI implementation results in EFS throwing ErrorDBTransactionConflict, then this is an indication of a concurrency bug in EFS because its usage of locks should prevent any write conflicts. Thus it should be fixed such that ErrorDBTransactionConflict cannot ever be thrown in EFS.

One happy change is the expectation that reads will now be entirely consistent across get and iterator. This means there's no need to create an iterator up front to get consistent reads. One can iterate on a level path and perform get on other key paths without worries. This behaviour is also backwards compatible of course; since there's no place where we are expecting inconsistent reads in EFS, we have always used the multi-iterator pattern to maintain consistent reads.

The situation is different in PK, as soon as it upgrades to SI transactions, it will basically drop all usages of locking with respect to the DB. However it may preserve locking in situations where write-skew may occur. Write skew occurs when one transaction writes, and another transaction reads, and some consistency rule is broken. See:

A variety of solutions are possible: materialize the conflict, use read locks... etc.

In the long term, we may upgrade the transaction from SI to SSI (serializable snapshot isolation) which was a recent invention from 2008, and this will even prevent write-skews, and thus all DB locks can be completely dropped.
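To make write skew concrete, here is a toy simulation under snapshot isolation (the classic on-call doctors example; all names are illustrative): both transactions read a consistent snapshot, check an invariant, and write disjoint keys, so neither produces a write-write conflict, yet the invariant breaks.

```typescript
type Store = Map<string, number>;

// Run a "transaction" against a snapshot; it returns its write set
function runTran(db: Store, f: (snapshot: Store) => Store): Store {
  const snapshot = new Map(db); // SI: reads see the start-of-tran state
  return f(snapshot);
}

const db: Store = new Map([
  ['alice', 1],
  ['bob', 1],
]); // 1 = on call, 0 = off call; invariant: at least one doctor on call

// Both transactions start concurrently against the same database state
const writesA = runTran(db, (snap) =>
  snap.get('alice')! + snap.get('bob')! >= 2
    ? new Map<string, number>([['alice', 0]]) // alice goes off call
    : new Map<string, number>(),
);
const writesB = runTran(db, (snap) =>
  snap.get('alice')! + snap.get('bob')! >= 2
    ? new Map<string, number>([['bob', 0]]) // bob goes off call
    : new Map<string, number>(),
);

// Disjoint write sets: no write-write conflict, so both commits succeed
for (const [k, v] of writesA) db.set(k, v);
for (const [k, v] of writesB) db.set(k, v);
// The invariant is now broken: alice=0 AND bob=0
```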

Now when a conflict occurs, we may decide to auto-retry. However this is complicated by other side-effects that may occur. Only if at least one of these conditions is true can you do an auto-retry.

Auto-retries should not be done when the action should be changed due to a change in state. What "should" means depends on the usability of the program.

So there's a "roadmap" for the transaction development:

  1. First introduce SI transactions which are backwards compatible
  2. Secondly bring back opt-in key-locking into SI transactions that enable PCC transactional behaviour, this means PCC locks currently used can be dropped, but deadlock detection becomes important
  3. Thirdly introduce SSI, so that write-skew handling code can be dropped

The second and third phases do not block our testnet deployment. They just improve our software model, reduce future bugs, and reduce entropy in our code.

CMCDragonkai commented 2 years ago

@tegefaulkes

With the new DB integration, there are 2 places where locking will be necessary to ensure serial counter updates:

  1. notifications
    • message count - this is a cardinal number, it is incremented on receive and decremented on delete
  2. sigchain
    • sequence number - this is an ordinal number, this is incremented on each new claim entered into the sigchain

To deal with these, we need to add locking before starting the transaction, and only do this as part of the public methods. If these 2 properties may be read via a public method, then we should be using a RWLockWriter; if these 2 properties are only ever written to from public methods, then a Lock suffices.

The DB has no knowledge about these locks atm, so they occur outside the transaction as properties on the Sigchain and NotificationManager.

When acquiring locks for these transactions do it in this order:

withF([this.lock.write(), this.db.transaction()], async ([, tran]) => {
  // use tran
});

It is important to acquire the lock prior to the transaction to avoid building up resources to hold the transaction while waiting for the lock.
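A toy model of this pattern for a serial counter like the sigchain sequence number (Lock here is a minimal promise-chain mutex standing in for the async-locks Lock, and stored stands in for the counter key in the DB):

```typescript
class Lock {
  protected tail: Promise<void> = Promise.resolve();
  public async acquire(): Promise<() => void> {
    const prev = this.tail;
    let release!: () => void;
    const next = new Promise<void>((resolve) => { release = resolve; });
    this.tail = prev.then(() => next);
    await prev; // wait for the previous holder
    return release;
  }
}

const counterLock = new Lock();
let stored = 0; // stand-in for the counter key in the DB

// Lock BEFORE the "transaction", so waiters are not holding open
// transaction resources (snapshots, iterators) while queued on the lock
async function increment(): Promise<number> {
  const release = await counterLock.acquire();
  try {
    const value = stored; // transactional read
    await Promise.resolve(); // simulate the async read/write boundary
    stored = value + 1; // transactional write + commit
    return value + 1;
  } finally {
    release();
  }
}
```

Without the lock, the await between read and write would let concurrent increments interleave and lose updates; with it, sequence numbers come out strictly serial.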

Note that when the DB gains the ability to lock things internal to the transaction, that is PCC control, this can be replaced with just:

withF([this.db.transaction()], async ([tran]) => {
  await tran.lock('counter');
  await tran.get('counter');
});

The API hasn't been fully worked out for PCC locking, it's possible locking can be integrated directly into get, put, and del operations. And we still have to figure out re-entrant locking and deadlock detection. So for now PK will just use locks as discussed above without expecting the DB supporting locking.

Even when PCC is used, ErrorDBTransactionConflict can still occur, that's expected if a write-set conflict occurs.

Still to do is to identify where write-skews may occur.

CMCDragonkai commented 2 years ago

Currently the iterator still doesn't support keyAsBuffer: false and valueAsBuffer: false, this is held up in the SI PR: https://github.com/MatrixAI/js-db/pull/18. I may be able to extract that out and cherry pick to master to release a minor version. Until then, you have to continue using dbUtils.deserialize. @tegefaulkes

tegefaulkes commented 2 years ago

Expanded the spec with details from the PR.

CMCDragonkai commented 2 years ago

The current staging of js-db https://github.com/MatrixAI/js-db/pull/38 has the new DBTransaction and the usage of rocksdb. It's still got some build issues to solve before it is ready. That update will end up resulting in @matrixai/db at 5.0.0. This also brings in a new update to @matrixai/async-locks and @matrixai/logger. These changes are somewhat breaking, so it should be done together. First by applying it to EFS, and then to PK. Similar to what we did before with the DB update.

For EFS, the update should focus on dealing with:

For PK, the update should focus on dealing with:

We should also add benchmarks to identify slowness. I think rocksdb is a bit slower for normal gets/puts, but the iterator and transaction implementation should be a lot faster since it uses the native C++ without additional JS abstraction.

The PR to js-polykey should also solve #244.

CMCDragonkai commented 2 years ago

Because lots of methods will now be transactional with an optional transaction, they will all need variants of something like this:

  public async pushTask(task, tran?: DBTransaction): Promise<void> {
    if (tran == null) {
      return this.db.withTransactionF(
        (tran) => this.pushTask.apply(this, [...arguments, tran])
      );
    }
    // Do the work
  }

Ideally we could abstract even the calling itself more... but arguments.callee is not available under strict mode.
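One way to abstract the boilerplate without arguments.callee is a higher-order wrapper. This sketch (a hypothetical transactional helper, with Tran standing in for DBTransaction) shows the shape for a single-argument method:

```typescript
type Tran = { id: number }; // stand-in for DBTransaction
type WithTranF = <T>(f: (tran: Tran) => Promise<T>) => Promise<T>;

// Wrap a method that requires a transaction so that callers may omit it,
// in which case a fresh transaction context is created around the call
function transactional<A, R>(
  withTransactionF: WithTranF,
  method: (arg: A, tran: Tran) => Promise<R>,
): (arg: A, tran?: Tran) => Promise<R> {
  return (arg, tran) =>
    tran != null ? method(arg, tran) : withTransactionF((t) => method(arg, t));
}
```

Extending this to arbitrary arity needs variadic tuple types (more of that "typescript magic"), since the optional tran sits after a variable-length parameter list.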