MatrixAI / js-db

Key-Value DB for TypeScript and JavaScript Applications
https://polykey.com
Apache License 2.0

Introduce Snapshot Isolation OCC to DBTransaction #19

Closed CMCDragonkai closed 2 years ago

CMCDragonkai commented 2 years ago

Derived from #18

Description

This implements the snapshot isolation DB transaction.

This means DBTransaction will be automatically snapshot-isolated, making most locking unnecessary.

Instead, when performing a transaction, there's a chance of an ErrorDBTransactionConflict exception, which means there was a write conflict with another transaction.

Users can then decide at their discretion to retry the operation if they need to (assuming any non-DB side-effects are idempotent, no-ops, or can be compensated). This should reduce the amount of locking overhead we need to do in Polykey. We may bubble the conflict exception up to the user so the user can re-run their command, or in some cases perform the retry automatically in code. The user in this case can be the PK client, another PK agent, or the PK GUI. A retry wrapper might look like the sketch below.
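
A minimal retry sketch (withConflictRetry is hypothetical; the withTransactionF helper and errors export are assumptions, only ErrorDBTransactionConflict is introduced by this PR):

```ts
import type { DB, DBTransaction } from '@matrixai/db';
import { errors } from '@matrixai/db';

// Re-run an optimistic transaction a bounded number of times on write
// conflict; `f` must be safe to re-run (side-effects idempotent, no-ops,
// or compensated)
async function withConflictRetry<T>(
  db: DB,
  f: (tran: DBTransaction) => Promise<T>,
  retries: number = 3,
): Promise<T> {
  while (true) {
    try {
      return await db.withTransactionF(f);
    } catch (e) {
      if (e instanceof errors.ErrorDBTransactionConflict && retries-- > 0) {
        continue; // conflict detected at commit; retry with a fresh snapshot
      }
      throw e;
    }
  }
}
```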

There is still one situation where user/application locks are needed, and that's where there may be a write skew. See snapshot isolation https://en.wikipedia.org/wiki/Snapshot_isolation for more details, and also https://www.cockroachlabs.com/blog/what-write-skew-looks-like/. The classic on-call example is sketched below.
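
A sketch of the anomaly, assuming both transactions start before either commits (withTransactionF and the string values are illustrative):

```ts
import type { DB } from '@matrixai/db';
declare const db: DB;

// Invariant we want: at least one doctor stays on call.
// Each transaction reads both keys from its own snapshot but writes a
// different key, so neither write conflicts with the other, both commits
// succeed, and both doctors end up off call: that's write skew.
const t1 = db.withTransactionF(async (tran) => {
  if ((await tran.get('doctor1')) === 'on' &&
      (await tran.get('doctor2')) === 'on') {
    await tran.put('doctor1', 'off');
  }
});
const t2 = db.withTransactionF(async (tran) => {
  if ((await tran.get('doctor1')) === 'on' &&
      (await tran.get('doctor2')) === 'on') {
    await tran.put('doctor2', 'off');
  }
});
await Promise.all([t1, t2]); // neither transaction throws a conflict
```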

In the future we may upgrade to SSI (serializable snapshot isolation) which will eliminate this write-skew possibility.

~Additionally this PR will also enable the keyAsBuffer and valueAsBuffer options on the iterators, enabling easier usage of the iterators without having to use dbUtils.deserialize<T>(value) where it can be configured ahead of time.~ - already merged

See https://www.fluentcpp.com/2019/08/30/how-to-disable-a-warning-in-cpp/ for how to disable warnings in C++ cross-platform.

Also see: https://nodejs.github.io/node-addon-examples/special-topics/context-awareness/

Issues Fixed

Tasks

Final checklist

CMCDragonkai commented 2 years ago

In trying out rocksdb, it appears there are differences at the C++ level.

In the new classic-level we have iterator_close and iterator_nextv, while the current rocksdb is still using iterator_end and iterator_next.

It's basically the same as leveldown, whereas classic-level upgraded its API. We should be able to change up the binding.cc and recompile.

Also note that rocksdb is 7.9 MiB, while leveldb was only half a megabyte. It takes significantly longer to compile, which means I'm trying out multi-core builds for node-gyp build using npm_config_jobs=max.

CMCDragonkai commented 2 years ago

The npm_config_jobs=max should be backported to TS-Demo-Lib-Native.

CMCDragonkai commented 2 years ago

I had a thought that if we were developing solely on Nix, we could just use https://github.com/NixOS/nixpkgs/blob/master/pkgs/development/libraries/rocksdb/default.nix, expect our binding.gyp to compile, and at the end just link with whatever shared objects the rocksdb library provides. It would be faster and less maintenance than keeping the rocksdb source code internally.

However, that would preclude any ability to compile and build for Windows and possibly other platforms. macOS could still work with Nix, of course.

We don't have any Nix integration with macOS in the CI/CD yet; I did explore it, but I wasn't sure whether it would be more performant, given that the macOS runners don't come with Nix.

CMCDragonkai commented 2 years ago

I've tested with rocksdb; the only issue right now is that rocksdb is still using the old abstract-leveldown-compatible API, whereas I'm currently using the abstract-level API.

The classic-level already upgraded to using abstract-level, but rocksdb hasn't.

It's only a few minor changes stemming primarily from this commit: https://github.com/Level/classic-level/commit/b9ba3dd6c5e5b36387d1efb040ad69edb6045dc9

2 methods are missing from rocksdb/index.cpp: https://github.com/MatrixAI/js-db/pull/19#issuecomment-1146639422; see the history from that point onwards: https://github.com/Level/classic-level/commits/main/binding.cc. The migration should be simple; the C++ code is written very similarly between classic-level and rocksdb.

So we will need to first upgrade rocksdb/index.cpp in a similar vein to how classic-level did it, and maintain API parity with leveldb/index.cpp.

CMCDragonkai commented 2 years ago

This diff https://github.com/Level/classic-level/compare/ba729d223ec5ec54c5ad88e05c11e7bba71315f2...main shows the changes from the point of forking from leveldown up to the classic-level main branch tip. We can use this to figure out what needs to be changed for rocksdb.

CMCDragonkai commented 2 years ago

I've synced up the index.cpp API with the changes in classic-level; however, I'm hitting a segmentation fault during the db_open call. Will need to debug this further.

CMCDragonkai commented 2 years ago

Just a silly mistake; it's working now.

CMCDragonkai commented 2 years ago

All of DB.test.ts and DBIterator.test.ts now pass with rocksdb.

CMCDragonkai commented 2 years ago

Benchmarks for Matrix ML 1:

[nix-shell:~/Projects/js-db]$ npm run bench

> @matrixai/db@4.0.5 bench
> rimraf ./benches/results && ts-node -r tsconfig-paths/register ./benches

Running "DB1KiB" suite...
Progress: 100%

  get 1 KiB of data:
    40 632 ops/s, ±3.33%   | fastest

  put 1 KiB of data:
    25 323 ops/s, ±1.62%   | 37.68% slower

  put zero data:
    28 292 ops/s, ±2.01%   | 30.37% slower

  put zero data then del:
    15 497 ops/s, ±2.21%   | slowest, 61.86% slower

Finished 4 cases!
  Fastest: get 1 KiB of data
  Slowest: put zero data then del

Saved to: benches/results/DB1KiB.json

Saved to: benches/results/DB1KiB.chart.html
Running "DB1MiB" suite...
Progress: 100%

  get 1 MiB of data:
    544 ops/s, ±1.92%      | 98.01% slower

  put 1 MiB of data:
    533 ops/s, ±1.22%      | slowest, 98.05% slower

  put zero data:
    27 299 ops/s, ±1.70%   | fastest

  put zero data then del:
    13 880 ops/s, ±1.65%   | 49.16% slower

Finished 4 cases!
  Fastest: put zero data
  Slowest: put 1 MiB of data

Saved to: benches/results/DB1MiB.json

Saved to: benches/results/DB1MiB.chart.html

And for matrix-vostro-5402-1:

[nix-shell:~/Projects/js-db]$ npm run bench

> @matrixai/db@4.0.5 bench
> rimraf ./benches/results && ts-node -r tsconfig-paths/register ./benches

Running "DB1KiB" suite...
Progress: 100%

  get 1 KiB of data:
    58 249 ops/s, ±4.22%   | fastest

  put 1 KiB of data:
    30 449 ops/s, ±6.80%   | 47.73% slower

  put zero data:
    35 626 ops/s, ±7.86%   | 38.84% slower

  put zero data then del:
    16 480 ops/s, ±6.61%   | slowest, 71.71% slower

Finished 4 cases!
  Fastest: get 1 KiB of data
  Slowest: put zero data then del

Saved to: benches/results/DB1KiB.json

Saved to: benches/results/DB1KiB.chart.html
Running "DB1MiB" suite...
Progress: 100%

  get 1 MiB of data:
    578 ops/s, ±2.75%      | slowest, 98.11% slower

  put 1 MiB of data:
    673 ops/s, ±0.53%      | 97.8% slower

  put zero data:
    30 550 ops/s, ±4.90%   | fastest

  put zero data then del:
    18 064 ops/s, ±8.12%   | 40.87% slower

Finished 4 cases!
  Fastest: put zero data
  Slowest: get 1 MiB of data

Saved to: benches/results/DB1MiB.json

Saved to: benches/results/DB1MiB.chart.html

It appears 1 MiB reads have slowed down quite a bit with rocksdb compared to leveldb.

CMCDragonkai commented 2 years ago

RocksDB's OptimisticTransactionDB is a class that extends StackableDB, which extends DB. The current index.cpp is using just DB, which means that once we use OptimisticTransactionDB, it affects the entire js-db.

Furthermore, rocksdb does not throw exceptions; it relies on the returned Status, which must be checked and then thrown as a JS exception like we are doing right now.

It is possible to use the debugger on the native code when running with nodejs. You have to build a debug version of the native addon with node-gyp build --debug or node-gyp rebuild --debug.

The node-gyp-build loader actually prefers the release version over the debug version, so if both builds are available, the release version will be loaded.

https://github.com/prebuild/node-gyp-build/blob/2e982977240368f8baed3975a0f3b048999af40e/index.js#L33-L39

The debug build is quite heavy, since all the dependencies end up being built with debug symbols too; it comes out at 130 MiB.

I reckon one should avoid deleting the release build when attempting to debug: just move it somewhere else, and move it back in when you want to use the release build again.

Once the debug build is available, one can use gdb on node. This ends up doing the same thing as a launch.json setup.

With nodejs it's a bit smarter, since the debugger can auto-attach to any nodejs process. This makes it easy to debug nodejs code: because we are writing TypeScript, upon using ts-node, nodejs runs, the debugger is attached, and breakpoints etc. just work.

However, C/C++ seems to require a bit more configuration: https://code.visualstudio.com/docs/cpp/launch-json-reference, especially since we may be using ts-node in one case, jest in another case... The debugger has to attach to the node process.

It could also just work with gdb --args /usr/local/bin/node app.js. Not sure if that triggers vscode's breakpoints.

I believe that we never used launch.json at all, and only relied on the automatic attach system that vscode has. This made it easier, with zero configuration. It would be nice if gdb could also be simultaneously auto-launched/attached for a given program, in this case nodejs. One might even be able to run both debuggers at the same time, so you can add breakpoints/logpoints in both TS and C/C++ code.

CMCDragonkai commented 2 years ago

It appears that DB supports OpenForReadOnly, and this is a static member function inherited by OptimisticTransactionDB; however, the static function produces a DB and not an OptimisticTransactionDB.

I'm actually not sure about this. The compiler does complain about an invalid conversion from DB to OptimisticTransactionDB, or OptimisticTransactionDB to DB.

Obviously we want to open an OptimisticTransactionDB, so it seems OpenForReadOnly won't work.

CMCDragonkai commented 2 years ago

So, given there's no read-only mode, we can remove the readonly option.

CMCDragonkai commented 2 years ago

It appears changing over to OptimisticTransactionDB is fully compatible with normal DB operation.

The only major change is the additional BeginTransaction method.

There are also additional options that we can pass in during open: OptimisticTransactionDBOptions.

These are only available via the Open overload that also takes ColumnFamilyDescriptors and ColumnFamilyHandles:

  static Status Open(const DBOptions& db_options,
                     const OptimisticTransactionDBOptions& occ_options,
                     const std::string& dbname,
                     const std::vector<ColumnFamilyDescriptor>& column_families,
                     std::vector<ColumnFamilyHandle*>* handles,
                     OptimisticTransactionDB** dbptr);

But the options don't seem relevant to us:

struct OptimisticTransactionDBOptions {
  OccValidationPolicy validate_policy = OccValidationPolicy::kValidateParallel;

  // works only if validate_policy == OccValidationPolicy::kValidateParallel
  uint32_t occ_lock_buckets = (1 << 20);
};

CMCDragonkai commented 2 years ago

We will need to treat transactions similarly to iterators in the struct Database in index.cpp.

For example:

  const leveldb::Snapshot* NewSnapshot () {
    return db_->GetSnapshot();
  }

  leveldb::Iterator* NewIterator (leveldb::ReadOptions* options) {
    return db_->NewIterator(*options);
  }

We'll add something like:

  rocksdb::Transaction* NewTransaction (rocksdb::WriteOptions* writeOptions, rocksdb::TransactionOptions* tranOptions) {
    return db_->BeginTransaction(*writeOptions, *tranOptions);
  }

Note that the transaction options parameter is optional; its defaults are:

// Options to use when starting an Optimistic Transaction
struct OptimisticTransactionOptions {
  // Setting set_snapshot=true is the same as calling SetSnapshot().
  bool set_snapshot = false;

  // Should be set if the DB has a non-default comparator.
  // See comment in WriteBatchWithIndex constructor.
  const Comparator* cmp = BytewiseComparator();
};

The only relevant option for us would be set_snapshot. I believe it doesn't really need to be set at all: https://github.com/facebook/rocksdb/wiki/Transactions.

Once I have the transaction, I can get an iterator with GetIterator. It appears it will make use of an overlay of uncommitted transaction state, like what I built in our current DBTransaction.

This should mean our DBTransaction class can likely be simplified quite a bit, relying instead on the C++ code, as sketched below.
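
A rough sketch of what that simplification could look like (transactionGet and transactionPut are hypothetical promisified bindings in the spirit of the rocksdbP.transaction_* calls mentioned below; the error code name is also illustrative):

```ts
// Opaque reference to the native transaction object (hypothetical typing)
type RocksDBTransaction = object;
declare const rocksdbP: {
  transactionGet(tran: RocksDBTransaction, key: Buffer, opts: object): Promise<Buffer>;
  transactionPut(tran: RocksDBTransaction, key: Buffer, value: Buffer): Promise<void>;
};

// DBTransaction becomes a thin wrapper: reads are answered from the native
// transaction's uncommitted overlay plus its snapshot, writes go into the
// native write batch
class DBTransaction {
  public constructor(protected tran: RocksDBTransaction) {}

  public async get(key: Buffer): Promise<Buffer | undefined> {
    try {
      return await rocksdbP.transactionGet(this.tran, key, {});
    } catch (e) {
      // error code name is illustrative
      if ((e as { code?: string }).code === 'NOT_FOUND') return undefined;
      throw e;
    }
  }

  public async put(key: Buffer, value: Buffer): Promise<void> {
    await rocksdbP.transactionPut(this.tran, key, value);
  }
}
```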

CMCDragonkai commented 2 years ago

This will be useful too https://www.fluentcpp.com/2019/08/30/how-to-disable-a-warning-in-cpp/.

CMCDragonkai commented 2 years ago

Regarding the mixing of optimistic and pessimistic transactions, it seems this is not easily doable inside rocksdb. I've asked a question about this: https://groups.google.com/g/rocksdb/c/5v64dTxYKEw.

For now, I'll proceed with OptimisticTransactionDB; if we need to, serialisable locking could be done on the outside in JS-land. It would of course be subject to potential deadlocks and other hazards until a lock manager can manage all of those.

CMCDragonkai commented 2 years ago

The index.cpp maintains its own "classes" with struct Database, struct Iterator, etc. I've created struct Transaction to act as the internal reference object that JS-land must pass as a reference into the rocksdbP.transaction_* calls.

Therefore there are multiple API layers:

  1. In JS-land we have the rocksdbP and rocksdb module objects - this is what is exposed for JS to call.
  2. In C++ we have transaction_open, transaction_commit, transaction_rollback (and more) as the top-level calls.
  3. The top-level static calls interact with the struct Transaction object.
  4. The struct Transaction object maintains an internal rocksdb::Transaction object and performs calls against that internal object.
  5. It also calls into struct Database, because the database object keeps references to the transactions in order to close them when closing the database.

I find it interesting that the C++ land mirrors the JS land. Do we really need so many object layers? This is how classic-level was already designed, with its struct Iterator and struct Database.

I imagine that in C++ land one could potentially do away with our own Transaction objects, and instead pass rocksdb::Transaction objects directly as references usable from JS-land; why bother creating another object wrapper encapsulating rocksdb objects?

Possibly only if the native addon itself has to do complicated stuff that is not related to the internal rocksdb.

CMCDragonkai commented 2 years ago

Once we get around to creating iterators within the transaction, do we say that the Database still owns the Iterator, or is this now Database -> Transaction -> Iterator?

CMCDragonkai commented 2 years ago

It would be good to also use clang-tidy. Integration requires all the relevant include paths; will need to see https://github.com/mapbox/node-cpp-skel/pull/64. It requires the clang-tools package.

CMCDragonkai commented 2 years ago

Discovered a bug in iterator_close, and therefore in transaction_rollback and transaction_commit. This affected upstream too. Basically, repeated calls to iterator_close would fail because the second call onwards was a no-op that never invoked the callback, so the awaiting JS code would never continue executing.

The solution is to still call the callback with null (indicating no error) even when the internal side effect is conditionally skipped. The sketch below shows how the hang manifested.
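
From JS-land the bug looked like this (iteratorClose standing in for the promisified iterator_close binding):

```ts
// Illustrative promisified binding and iterator reference
declare const iteratorClose: (iterator: unknown) => Promise<void>;
declare const iterator: unknown;

await iteratorClose(iterator); // first close: callback fires, promise settles
await iteratorClose(iterator); // before the fix: the second call was a no-op
                               // that never invoked its callback, so this
                               // promise never settled and execution hung
```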

I also checked whether db_close had the same problem, and it doesn't. This is because it either queues the callback for execution, or assigns it as the database->pendingCloseWorker_, which will be executed at the end of each priority worker.

Note that most workers are PriorityWorkers, except OpenWorker, CloseWorker, DestroyWorker, RepairWorker, CloseIteratorWorker, and NextWorker.

I believe I should be making CommitTransactionWorker and RollbackTransactionWorker PriorityWorkers as well. It seems important that they be prioritised over the other workers, if I'm interpreting this correctly.

CMCDragonkai commented 2 years ago

It seems PriorityWorker is basically intended for async work that should always take priority over closing the DB.

Then I wonder why things like CloseIteratorWorker and NextWorker are not priority workers.

CMCDragonkai commented 2 years ago

I've got transaction_init, transaction_commit, and transaction_rollback working.

I think it's a good idea to split up the C++ code, which is currently 2476 lines. It's getting difficult to constantly have 2/3 windows into the same file.

This should just be a matter of creating a few separate files that are separately compiled and linked together. There will need to be header files to include, with forward declarations where necessary.

CMCDragonkai commented 2 years ago

The napi-macros package doesn't export function names when using NAPI_EXPORT_FUNCTION, so I've just redefined it here. In the future we can probably switch to node-addon-api instead, which is better maintained.

Here's what I've done:

/**
 * NAPI_EXPORT_FUNCTION does not export the name of the function
 */
#undef NAPI_EXPORT_FUNCTION
#define NAPI_EXPORT_FUNCTION(name) \
  { \
    napi_value name##_fn; \
    NAPI_STATUS_THROWS_VOID(napi_create_function(env, #name, NAPI_AUTO_LENGTH, name, NULL, &name##_fn)) \
    NAPI_STATUS_THROWS_VOID(napi_set_named_property(env, exports, #name, name##_fn)) \
  }

This means the exported functions will have the name property, and we can see in a stack trace which function was called.

Furthermore, promisify is extended to preserve the function name when wrapping:

  Object.defineProperty(g, 'name', { value: f.name });

This means promisify should be updated in PK as well @emmacasolin:

```ts
/**
 * Convert callback-style to promise-style
 * If this is applied to overloaded function
 * it will only choose one of the function signatures to use
 */
function promisify<
  T extends Array<unknown>,
  P extends Array<unknown>,
  R extends T extends [] ? void : T extends [unknown] ? T[0] : T,
>(
  f: (...args: [...params: P, callback: Callback<T>]) => unknown,
): (...params: P) => Promise<R> {
  // Uses a regular function so that `this` can be bound
  const g = function (...params: P): Promise<R> {
    return new Promise<R>((resolve, reject) => {
      const callback = (error, ...values) => {
        if (error != null) {
          return reject(error);
        }
        if (values.length === 0) {
          (resolve as () => void)();
        } else if (values.length === 1) {
          resolve(values[0] as R);
        } else {
          resolve(values as R);
        }
        return;
      };
      params.push(callback);
      f.apply(this, params);
    });
  };
  Object.defineProperty(g, 'name', { value: f.name });
  return g;
}
```
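
Usage then looks like this (db_get is a stand-in for any native callback-style binding):

```ts
const dbGet = promisify(rocksdb.db_get);
// The wrapper now carries the native function's name into stack traces
console.log(dbGet.name); // 'db_get'
const value = await dbGet(db, key, {});
```
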
CMCDragonkai commented 2 years ago

In relation to https://github.com/facebook/jest/issues/12814, it turns out that when running in Jest, even exceptions created by napi_create_error are not instances of Error.

This means any tests against the native addons need to be aware that strict instanceof checks won't work.

This means that where the source code checks whether e is an instanceof Error, it just can't be done; instead we must only check properties like code, as in the sketch below.
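
For example, a test can assert on the properties that napi_create_error attaches rather than on the prototype chain (the db_open call shape and error code are per the later comments in this thread; exact signatures are assumptions):

```ts
const dbOpen = promisify(rocksdb.db_open);
try {
  await dbOpen(database, '/nonexistent/path', { createIfMissing: false });
  fail('should have thrown');
} catch (e) {
  // `e instanceof Error` may be false under Jest for napi-created errors,
  // so duck-type on the attached properties instead
  expect(e).toHaveProperty('code', 'DB_OPEN');
}
```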

CMCDragonkai commented 2 years ago

Looks like I need some forward declarations here; right now there's an include loop between database.h, transaction.h, and iterator.h.

I wonder, though, whether these headers need to contain the forward declarations themselves?

CMCDragonkai commented 2 years ago

The warning about the log level is a legitimate warning. It is possible for the log level configuration to be wrong, and if so, that would result in a JS exception being thrown; however, this doesn't prevent the C++ constructor code from continuing.

The ideal case is for the C++ constructor to throw an exception if it fails to construct for whatever reason. The problem here is that binding.gyp and related files were inherited from rocksdb in level, and there seems to be configuration in there that disables C++ exceptions, if not on Linux, at least on Windows.

So for now, since we don't want to experiment with C++ exceptions right now, we can do a couple of things:

  1. Use a bare return to cancel computation; this means the object is still constructed, just the rest of the code hasn't finished initialising. This creates a zombie object, as explained here: https://stackoverflow.com/a/737656/582917.
  2. Use a named constructor or some static factory function to do the construction; then we can return some sort of status code.
  3. Make it so that if the log level configuration parameter is wrong, it just falls back to the default behaviour, which is: if the log level size is not greater than 0, use the null logger.
  4. Take the validity check out of the constructor entirely and move it into db_open, throwing the JS exception there and never proceeding to construct the OpenWorker.

I'm going to try out solution 4.

CMCDragonkai commented 2 years ago

Solution 4 should be working now. We figure out the rocksdb::InfoLogLevel and rocksdb::Logger ahead of time in db_open, and then pass them into the OpenWorker.

Note that setting the NullLogger only occurs when the infoLogLevel option is empty. Otherwise, it is left as whatever it defaults to.

CMCDragonkai commented 2 years ago

No more major warnings. And the C++ code is now split out.

CMCDragonkai commented 2 years ago

All tests pass on the split-out code.

Also, if an invalid log level is used, we get a JS exception in the callback with code DB_OPEN and the message Invalid log level.

CMCDragonkai commented 2 years ago

Now to add in the rest of the transaction functions.

CMCDragonkai commented 2 years ago

I have the basic transaction methods; it's time to refactor DBTransaction to support those methods, and reactivate some simple tests involving DBTransaction.

CMCDragonkai commented 2 years ago

> I believe that we never used launch.json at all, and only relied on the automatic attach system that vscode has. [...] It would be nice if gdb can also be simultaneously auto-launched/attach for a given program [...] One might even be able to run both debuggers at the same time, and so you can add breakpoints/logpoints in TS and C/C++ code.

I had to just try this.

So the issue was that there was a segfault in the C++ code occurring in the Transaction::Put method call.

I narrowed it down to:

  rocksdb::Status s = dbTransaction_->Put(key, value);

But I couldn't see where the segfault was coming from.

The first thing was to delete the build directory so that the old Release build was removed, because node-gyp-build prefers the Release build over the Debug build. I believe we should change this up by making node-gyp-build prefer the debug build first, since this allows us to debug more easily; then just delete the debug build when we're done.

Then with node-gyp rebuild --debug we rebuild the debug binary.

Now the issue gets a bit complicated: normally we are using ts-node to run things, but ts-node isn't the binary we want to run gdb on; it's the node binary. For now I don't know how to use gdb through ts-node, so I first just used tsc to compile all of the code: rimraf ./dist && tsc -p ./tsconfig.build.json.

Then I wrote a simple JS script to check, in this case a prototyping script test-rocksdb.js.

Finally I could run gdb node, and inside the GDB shell, use run ./test-rocksdb.js.

In this case I'm not even using a breakpoint at all, but I do have https://github.com/cyrus-and/gdb-dashboard integrated into my system.

What it showed me was that the this pointer in the stack frame was 0x0. That's a null pointer.

Subsequently, checking this against nullptr confirmed that this was null.


Note that subsequent runs/builds don't always show this; sometimes it ended up being 0x3. In any case, this is not what it should be: it should be a pointer to the Transaction object.


That's when I checked transaction_workers.cpp and noticed that I had forgotten to initialise tran_ in the other transaction workers I created; TransactionCommitWorker and TransactionRollbackWorker had it properly set up.

Not initialising members means they are left as null pointers or possibly some other undefined value.

The use of C++ constructor initialiser lists is definitely error-prone, since they have to be in the order of the member declarations in the struct/class, and vscode doesn't tell you anything about the pointer not being initialised, because there are legitimate uses for leaving one uninitialised.

CMCDragonkai commented 2 years ago

For further intensive dev on C/C++, I definitely need to figure out how to integrate gdb into the vscode system while automatically compiling the TypeScript first. And probably integrate clang-format into it too.

CMCDragonkai commented 2 years ago

It's important to be aware that rocksdb slices can be converted to C++ strings with key.ToString(). To convert to a C string, c_str() can be applied on top: key.ToString().c_str(). That is important if you want to use fprintf, but in C++, std::cerr is more idiomatic:

  std::cerr << key.ToString().c_str() << std::endl;
  std::cerr << value.ToString() << std::endl;

CMCDragonkai commented 2 years ago

clang-tidy does some semantic checks. This is how you use it:

clang-tidy \
  --checks='*' \
  ./src/rocksdb/napi/worker.cpp \
  -- \
  -isystem /nix/store/zl4bvsqfxyx5vn9bbhnrmbmpfvzqj4gd-nodejs-16.14.2/include/node \
  -isystem /nix/store/hym1n0ygqp9wcm7pxn4sfrql3fg7xa09-python3-3.9.12/include \
  -isystem ./deps/rocksdb/rocksdb/include \
  -isystem ./node_modules/napi-macros

It can make use of a .clang-tidy configuration file as well, but I haven't figured that out yet.

Note the usage of include/node. This is because we are currently including node_api.h, but not node/node_api.h. It seems node-gyp knows to go into the node directory too.

There are a lot of checks and not all are relevant, so we would need to create our own standard, like we do with .eslintrc.

Because clang-tidy requires compilation flags, later we can just use $NIX_CFLAGS_COMPILE.

More information here: https://medium.com/@inflatonn/better-coding-practices-for-your-organisation-using-clang-tidy-and-clang-format-d736864b96e2

CMCDragonkai commented 2 years ago

It seems that to be compatible with the rest of the system, it would make sense to use #include <node/node_api.h> instead of just #include <node_api.h>. We can try it and see whether it compiles well on the other platforms too.

CMCDragonkai commented 2 years ago

We now have npm run lintnative and npm run lintnativefix. I disabled the include sorting because it's not able to tell whether something is an external library atm; clang-format is configured in .clang-format. The clang-tidy invocation could be further improved, like:

clang-tidy \
  --checks='*' \
  ./src/rocksdb/napi/worker.cpp \
  -- \
  $NIX_CFLAGS_COMPILE \
  -isystem ./deps/rocksdb/rocksdb/include \
  -isystem ./node_modules/napi-macros

Note that globbing doesn't work nicely in npm run scripts, so you must use find instead. The main problem is that compilation flags are necessary; node-gyp currently holds all of that configuration, so configuring clang-tidy is extra work.

Anyway, having clang-format working is great for now.

It might be better to rely on https://devblogs.microsoft.com/cppblog/visual-studio-code-c-december-2021-update-clang-tidy/

CMCDragonkai commented 2 years ago

I've tested transactionGetForUpdate, and it now works to solve write-skew anomalies pretty easily. Basically, it promotes a read into a same-value write. This explains my earlier question about the difference between software transactional memory and databases: STM seems to rely on value equality, but databases use logical timestamps instead, so writing the same value just means updating a logical timestamp. Databases have to deal with more sophisticated data, whereas STM can get away with a simpler model. The fix for the write-skew sketch from the top of this thread is shown below.
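
Revisiting the earlier write-skew sketch, the fix looks something like this (method naming follows transactionGetForUpdate; the exact DBTransaction signature is an assumption):

```ts
await db.withTransactionF(async (tran) => {
  // getForUpdate records each read as a same-value write (a logical
  // timestamp bump), so if a concurrent transaction writes either key,
  // one of the two commits fails with ErrorDBTransactionConflict
  const onCall1 = await tran.getForUpdate('doctor1');
  const onCall2 = await tran.getForUpdate('doctor2');
  if (onCall1 === 'on' && onCall2 === 'on') {
    await tran.put('doctor1', 'off');
  }
});
```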

CMCDragonkai commented 2 years ago

We can now proceed to bring in iterators for the transactions, and get onto fixing up DBTransaction.

CMCDragonkai commented 2 years ago

In bringing in first-class snapshots, we had to create a Snapshot class that encapsulates rocksdb::Snapshot. The reason is the same as for iterators/transactions: object lifecycle management. If we were to return a reference to a C++ object from a native method, JS would expect it to be garbage collected as soon as the scope finished; because the object must live beyond the scope of the native method, we have to keep track of it.

This is done by incrementing its reference count by 1, but also keeping a reference to it in the Database. This way, if the Database closes, it will end up releasing all of these snapshots.

Note that there are several ways the Database closes:

  1. Directly through the dbClose native method; this does it asynchronously, and also uses snapshot_release_do.
  2. By allowing the napi_env to expire or become invalid, which is triggered through a cleanup hook when the node runtime itself is closing. This ends up calling env_cleanup_hook, which does the cleanup synchronously.

Now there's something additional to be concerned about when calling dbClose. Because it is asynchronous, the codebase currently tries to schedule these closing operations after any priority operations that are still running.

This is done through PriorityWorker and a reference count on Database, Database::ref_, with the count value put into the Database::priorityWork_ property.

So if there is priority work to do, the closing operation is registered as Database::pendingCloseWorker_, which ends up being called when the reference count reaches 0.

This explains why CloseWorker is extending BaseWorker, and not PriorityWorker.

Now we have a similar situation for transactions. If we proceed to call dbClose() while there are still asynchronous operations in the transactions, we should be scheduling the transaction rollbacks after any priority work that's being done in the transaction.

This is what's still TBD in transaction_rollback_do, as well as transactionCommit.

  // TODO:
  // if other async ops, delay this operation

To solve this, we should be adding a reference count of priority work in our Transaction. Then the PriorityWorker that comes out of Transaction should be decrementing the count for Transaction, not just the Database.

For iterators this was done differently: a DoFinally was registered for all the workers in the iterator, so iteratorClose is always scheduled after any asynchronous concurrent NextWorker. This was easy to do because there was only one other asynchronous method.

For transactions it would be a bit more tedious to implement.

The reason is that dbClose currently just immediately deletes the database if there is no priority work. Priority work isn't just whenever there's a PriorityWorker; it's also whenever there are Transaction, Snapshot, or Iterator objects still live. We could create a second constructor for PriorityWorker that increments and decrements the priority work for a Transaction instead.

CMCDragonkai commented 2 years ago

The new Snapshot class, when constructed with a Transaction, will do:

Snapshot::Snapshot(Transaction* transaction, const uint32_t id)
    : database_(transaction->database_),
      id_(id),
      isReleasing_(false),
      hasReleased_(false),
      ref_(NULL) {
  // This ensures that the transaction has consistent writes
  transaction->SetSnapshot();
  // Use this snapshot to get consistent reads
  dbSnapshot_ = transaction->GetSnapshot();
}

This ends up providing consistent reads and writes when snapshots are used.

One just needs to call transactionSnapshotInit like:

const snapshot = await transactionSnapshotInit(tran);

The DB has its own called dbSnapshotInit.

At the end, the snapshot should be released with snapshotRelease.

// notice this is asynchronous
await snapshotRelease(snapshot);

CMCDragonkai commented 2 years ago

Transactions have their own priority count now. However I think I broke something there... need to verify that the ref counts make sense.

CMCDragonkai commented 2 years ago

Transaction ref counts appear to be working.

CMCDragonkai commented 2 years ago

I have a test for snapshots now, but I need to figure out what happens if you only call transactionSnapshotInit(). Because it gives you back a snapshot object, you can't just call it without dealing with the result; otherwise you have a resource leak.

Also, the snapshot itself can't actually be released until after the transaction is committed or rolled back. I would argue that when that happens, the snapshots should also be invalidated. But right now, snapshot release has nothing to do with the transaction; it is instead done as part of the database shutdown.

So one must first transactionCommit and then snapshotRelease the snapshot if it was used. This is based on the theory that the snapshot must not be cleaned up while the transaction that acquired it is still live.

Perhaps one can release the snapshot before the transaction commits, but it shouldn't affect the commit itself.

CMCDragonkai commented 2 years ago

It's definitely kind of limiting not being able to use C++ exceptions; it makes it difficult to write procedures that terminate the execution thread and have the failure bubble upwards. If we write functions, we then need to check whether those functions returned true, but this goes against my ideal of writing procedures and functions separately, so that it is clear that the former perform side-effects and the latter don't.

So for now I've settled on macros.

Basically, the issue is that we should be checking whether the transaction is ready to use. If it has already been committed or rolled back, it should not be usable afterwards; without an appropriate check, we end up with a segfault.

It seems common to reach a segfault whenever something that shouldn't be used is being used. It's basically a null pointer exception, which would be obvious in a managed language, whereas in C++ we don't see it at all; it's just a segfault.

So for now the C++ code isn't particularly well abstracted; it would require more conditional checking and some further abstractions to clean up in the future, especially checking for valid states in which something can be performed.

If we are to improve the C++ codebase, we should be using node-addon-api for more sophisticated work. Not having exceptions is like trying to write in C, and C is just quite limited in its capabilities.

And yes, it does appear C, C++ (https://stackoverflow.com/a/32127664/582917), and Rust (Result) all prefer using something other than exceptions for high-performance error control flow. But since this is ultimately connected to JS control flow, I think it's fine to use exceptions.

CMCDragonkai commented 2 years ago

It appears that after committing the transaction, attempting to release the snapshot causes a segfault. I ran gdb over it, but it just steps into the delete casted_s instruction and results in Cannot access memory at address 0x0. I suspect this may simply be due to the fact that a transaction's snapshot doesn't need to be released if it has already been released by committing/rolling back the transaction.

CMCDragonkai commented 2 years ago

Now, if I don't release the snapshot, the act of closing the database causes it to attempt to release the snapshot. So I suspect that if we commit/rollback the transaction, we must not attempt to release that snapshot.

Actually, it turns out that if you release the snapshot earlier, this causes a bug in the transaction when it later attempts to close.

I think you never manually release the snapshot if the snapshot was created from a transaction, because the snapshot is then part of the lifecycle of the transaction. Hence it's acquired via Transaction::GetSnapshot.

All the example code shows just setting the snapshot pointer to nullptr after the transaction is completed.

CMCDragonkai commented 2 years ago

Anyway, to solve the problem we just have to ensure not to release the snapshot early. Or, if we are releasing the snapshot early, we would probably also have to clear the snapshot from the transaction before committing or rolling it back.

So we should make snapshotRelease a dbSnapshotRelease instead, since it cannot be applied to snapshots acquired from a transaction.

If it is applied, we should make the release throw an error. The transaction snapshot is only applicable to the read calls. And any snapshot that is no longer relevant should be flagged with a boolean, to prevent callers from re-using it.

CMCDragonkai commented 2 years ago

You have to set the snapshot before getting it; if you get it from the transaction before setting it, it will give back a null pointer.

CMCDragonkai commented 2 years ago

We now have two separate types, RocksDBTransactionSnapshot and RocksDBSnapshot: the former only for transaction snapshots, the latter exposing database snapshots. Don't mix them up! A usage sketch follows.
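
A sketch contrasting the two lifecycles (function names as mentioned in this thread; exact signatures are assumptions):

```ts
// Illustrative promisified bindings; names follow this thread,
// exact signatures are assumptions
declare const db: unknown;
declare const tran: unknown;
declare const dbSnapshotInit: (db: unknown) => Promise<unknown>;
declare const dbSnapshotRelease: (snap: unknown) => Promise<void>;
declare const transactionSnapshotInit: (tran: unknown) => Promise<unknown>;
declare const transactionCommit: (tran: unknown) => Promise<void>;

// Database snapshot: explicitly created and explicitly released
const dbSnap = await dbSnapshotInit(db);
// ... consistent reads against dbSnap ...
await dbSnapshotRelease(dbSnap);

// Transaction snapshot: owned by its transaction and never released
// manually; it is cleaned up when the transaction commits or rolls back
const tranSnap = await transactionSnapshotInit(tran);
// ... consistent reads within the transaction via tranSnap ...
await transactionCommit(tran);
```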