Closed CMCDragonkai closed 2 years ago
In order to allow some streaming based functionality, we have to allow the creation of iterators inside the transaction.
We will preserve the `read-committed` isolation level, which means we create iterators against the DB inside the transaction, and each iterator then needs to operate against both the snapshot and the real DB.
This means our snapshot needs to become an ordered key value structure.
This means we are creating another temporary DB to be the snapshot. However I'm going to first just try with a sublevel.
This would mean that there is a reserved area in the DB where transactions are working against, and this sublevel should never be used by any other places.
The usage of the `snapshot` sublevel does impact any usage of streaming on the root DB though, because streaming over the root would iterate over the snapshot sublevels too.
Furthermore we cannot just use snapshot sublevel, we would need to create a new sublevel for each snapshot that is bound to a transaction.
Alternatively we forego using a snapshot level entirely and just use an in-memory ordered map structure, but that is its own can of worms.
Most efficient would be `data` and `snapshots` as the 2 root sublevels. These additional "root" sublevels can be extended to deal with automatic indexing #1 which would then supersede #2. Which would give us:
!data
!snapshots
!index
Then index sublevels can be accessed separately and managed separately.
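To illustrate how reserved root sublevels keep transaction state apart from user data, here is a minimal sketch of prefix-based sublevel keys. It assumes the `!name!` prefix layout that subleveldown uses by default. The helpers `sublevelKey` and `sublevelRange` are hypothetical names for illustration and are not part of js-db.

```ts
// Hypothetical helpers, not part of js-db.
// Assumption: each sublevel is encoded as a `!name!` key prefix.
const SEP = '!';

// Build the full key for `key` stored under the nested sublevels `levels`.
function sublevelKey(levels: readonly string[], key: string): string {
  const prefix = levels.map((l) => `${SEP}${l}${SEP}`).join('');
  return prefix + key;
}

// Compute the key range covering everything under a non-empty sublevel path.
function sublevelRange(levels: readonly string[]): { gte: string; lt: string } {
  if (levels.length === 0) {
    throw new RangeError('at least one sublevel is required');
  }
  const prefix = sublevelKey(levels, '');
  // Upper bound: replace the trailing separator with the next character
  const lt = prefix.slice(0, -1) + String.fromCharCode(SEP.charCodeAt(0) + 1);
  return { gte: prefix, lt };
}
```

With this layout, an iterator bounded by `sublevelRange(['data'])` never sees keys under `!snapshots!`, whereas an unbounded iterator on the root DB would see both.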
Additional relevant issue https://github.com/Level/subleveldown/issues/111
Also I noticed that an `abstract-level` is being created and that may evolve the leveldb system to be more sophisticated. In particular supporting Uint8Array https://github.com/Level/abstract-level
The https://github.com/Level/abstract-level might be a good foundation for a new `DB`, which would then still use leveldown (now known as `classic-level`) as the underlying node database.
I've rolled up a bunch of issues in js-db into a larger epic https://github.com/MatrixAI/js-db/issues/11 that would be one big change to DB, and that would be breaking relative to EFS and PK. So this PR will focus on just integrating `withF` and `withG` and the ability to use `iterator` inside a transaction. However this may mean removing the `transact` and `withLocks` methods, which does break EFS, but that is an easy fix, since EFS doesn't actually use any of the DB locking, so it can just have `withF` integrated with the resource types.
The `DBTransaction` now uses `transactionDb` instead of snapshot. The `tran.iterator(['d1', 'd2'])` can take `DBDomain`, this makes our `iterator` method a bit different from leveldb's `iterator` method. This is because our tran methods tend to operate from the root, and that's how `get` and `put` work. Remember that since levels are not using the `DB` interface, we had to do this.
Furthermore the `transactionDb` is encrypted because it is on disk.
The `iterator` method is a bit strange, I need to test what happens when the keys are different between `transactionDb` and the `dataDb` it is operating over... And the usage of `seek`. Further prototyping required.
The changes to `DB` have impacted some tests.
✓ async construction constructs the filesystem state (13 ms)
✓ async destruction removes filesystem state (3 ms)
✓ async start and stop preserves state (18 ms)
✓ async start and stop preserves state without crypto (4 ms)
✓ async start and stop requires recreation of db levels (6 ms)
✓ creating fresh db (5 ms)
✓ get and put and del (11 ms)
✓ batch put and del (13 ms)
✕ db levels are leveldbs (6 ms)
✓ db levels are just ephemeral abstractions (4 ms)
✓ db levels are facilitated by key prefixes (12 ms)
✓ clearing a db level clears all sublevels (10 ms)
✕ lexicographic iteration order (9 ms)
✕ lexicographic buffer iteration order (112 ms)
✕ lexicographic integer iteration (7 ms)
✓ db level lexicographic iteration (10 ms)
✓ get and put and del on string and buffer keys (6 ms)
✕ streams can be consumed with promises (11 ms)
✓ counting sublevels (15 ms)
✓ parallelized get and put and del (428 ms)
✓ parallelized batch put and del (464 ms)
✓ works without crypto (6 ms)
I'm creating `Transaction.test.ts` now to test different usages of transaction and prototype the iterator usage.
I have a working prototype for the `tran.iterator`.
It is able to handle this sort of data:
KEYS | DB | SNAPSHOT | RESULT |
---|---|---|---|
a | a = a | a = 1 | a = 1 |
b | b = b | | b = b |
c | | c = 3 | c = 3 |
d | d = d | | d = d |
e | e = e | e = 5 | e = 5 |
f | | f = 6 | f = 6 |
g | | | |
h | h = h | | h = h |
i | | | |
j | | j = 10 | j = 10 |
k | k = k | k = 11 | k = 11 |
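The overlay behaviour in the table above can be sketched as a merge of two key-ordered sequences where the snapshot wins on equal keys. This is only an illustration over in-memory arrays, not the actual `tran.iterator` implementation. The `overlay` function and `Entry` type are hypothetical names, and the sketch ignores deletions, `seek` and reverse iteration. The sample data assumes that letter values come from the DB and numeric values come from the snapshot.

```ts
type Entry = [string, string];

// Merge two key-sorted entry lists into one key-sorted list.
// When both sides hold the same key, the snapshot entry takes priority.
function overlay(db: readonly Entry[], snapshot: readonly Entry[]): Entry[] {
  const result: Entry[] = [];
  let i = 0;
  let j = 0;
  while (i < db.length || j < snapshot.length) {
    if (j >= snapshot.length) {
      // Snapshot exhausted: drain the DB side
      result.push(db[i++]);
    } else if (i >= db.length) {
      // DB exhausted: drain the snapshot side
      result.push(snapshot[j++]);
    } else if (db[i][0] < snapshot[j][0]) {
      result.push(db[i++]);
    } else if (db[i][0] > snapshot[j][0]) {
      result.push(snapshot[j++]);
    } else {
      // Same key on both sides: emit the snapshot entry, skip the DB entry
      result.push(snapshot[j++]);
      i++;
    }
  }
  return result;
}

const merged = overlay(
  [['a', 'a'], ['b', 'b'], ['e', 'e']],
  [['a', '1'], ['c', '3'], ['e', '5']],
);
console.log(merged);
```

A real iterator would do the same comparison lazily, advancing one underlying iterator at a time instead of materialising both sides.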
Tests have been done for the transactional iterator.
Now it's time to clean up the tests for the `DB.test.ts`, and we can start to merge this and apply it to the `NodeGraph`.
In this PR, we are removing the `withLocks` and other `async-mutex` usages.
Having locks in the DB muddles the separation of responsibilities.
It turns out our read-committed transactions don't actually need locking at all.
You only need locks if you want to also prevent:
However this all depends on the end-user and their specific situation when using transactions. Sometimes they may use only simple mutexes, in other cases they need to use read/write locks, OCC, PCC... etc.
So therefore locking functionality is removed from the DB.
We are however using `withF` and `withG` here. And we may import this from: https://github.com/MatrixAI/js-resources
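As a rough picture of the resource bracketing idea behind `withF`, here is a simplified single-resource sketch. The type shapes below are assumptions made for illustration, so the real definitions in js-resources may differ. `withResource`, `store` and `acquireTran` are hypothetical names, and the toy transaction only buffers writes in memory.

```ts
// Simplified, assumed shapes; the real js-resources types may differ.
type ResourceRelease = (e?: Error) => Promise<void>;
type ResourceAcquire<T> = () => Promise<[ResourceRelease, T]>;

// Acquire a resource, run `f` with it, and always release afterwards.
// The release callback is told whether `f` failed.
async function withResource<T, R>(
  acquire: ResourceAcquire<T>,
  f: (resource: T) => Promise<R>,
): Promise<R> {
  const [release, resource] = await acquire();
  let error: Error | undefined;
  try {
    return await f(resource);
  } catch (e) {
    error = e as Error;
    throw e;
  } finally {
    await release(error);
  }
}

// Toy "transaction": writes are buffered and only applied on a clean release
const store = new Map<string, string>();
const acquireTran: ResourceAcquire<Map<string, string>> = async () => {
  const writes = new Map<string, string>();
  const release: ResourceRelease = async (e) => {
    if (e == null) {
      writes.forEach((v, k) => store.set(k, v));
    }
  };
  return [release, writes];
};
```

Calling `withResource(acquireTran, async (writes) => { writes.set('a', '1'); })` applies the write to `store` on success, while a callback that throws leaves `store` untouched. Any locking the caller needs can be modelled as just another acquired resource, which is why the DB itself no longer has to own it.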
We now have tests for each isolation property:
[nix-shell:~/Projects/js-db]$ npm test -- ./tests/Transaction.test.ts
> @matrixai/db@1.2.1 test /home/cmcdragonkai/Projects/js-db
> jest "./tests/Transaction.test.ts"
Determining test suites to run...
GLOBAL SETUP
PASS tests/Transaction.test.ts
Transaction
✓ snapshot state is cleared after releasing transactions (34 ms)
✓ get, put and del (17 ms)
✓ no dirty reads (17 ms)
✓ non-repeatable reads (9 ms)
✓ phantom reads (11 ms)
✓ lost updates (7 ms)
✓ iterator with same largest key (16 ms)
✓ iterator with same largest key in reverse (15 ms)
✓ iterator with snapshot largest key (12 ms)
✓ iterator with snapshot largest key in reverse (12 ms)
✓ iterator with db largest key (12 ms)
✓ iterator with db largest key in reverse (10 ms)
✓ iterator with undefined values (20 ms)
✓ iterator using seek and next (19 ms)
✓ queue success hooks (5 ms)
✓ queue failure hooks (7 ms)
✓ rollback on error (7 ms)
Test Suites: 1 passed, 1 total
Tests: 17 passed, 17 total
Snapshots: 0 total
Time: 1.003 s, estimated 3 s
Ran all test suites matching /.\/tests\/Transaction.test.ts/i.
GLOBAL TEARDOWN
And matches this table:
The `db.db` should no longer be used when attempting to access the internal DB at root.
You can use `db.dataDb`.
However that shouldn't be necessary either.
We now expose:
`DB.clear()` - use this to clear the `dataDb`
`DB.dump()` - use this to dump the `dataDb`
`DB.iterator()` - use this to get an iterator on `dataDb`
`DB.count()` - use this to count from `dataDb`
And with that, the `DB.test.ts` now passes.
One issue is the difference in API between `DB` and `Transaction` for some methods like `clear`, `iterator`, `dump` and `count`.
And the fact that I'm only supporting the promise API, like for `iterator`. So callback style won't work.
The `DBTransaction` interface is removed. In its place, I'm going to rename the `Transaction` class to `DBTransaction`. This is just because `Transaction` is quite an overloaded name, so importing like `import { DB, DBTransaction }` requires less renaming.
Rather than using the `AbstractIterator`, `DB.iterator` and `DBTransaction.iterator` now both return `DBIterator`. This will be used as a stopgap until #11 is done.
The APIs of `DB` and `DBTransaction` are still different because we want to eventually move to where you can just specify a key path instead of directly maintaining particular sublevel objects. Once all methods can just take a full keypath, there's no need to manage a bunch of sublevel objects, and that should reduce the object maintenance overhead.
This means later in #11, we should have:
// root iterator at dataDb
DB.iterator(options);
// same
DB.iterator(options, []);
// down level 1
DB.iterator(options, ['level1']);
DB.iterator(options, ['level1', 'level2']);
DBTransaction.iterator(options, ['level1', 'level2']);
DB.count();
DB.count(['level1', 'level2']);
DBTransaction.count();
DBTransaction.count(['level1']);
DB.clear();
DB.clear(['level1']);
DBTransaction.clear(['level1']); // this should be a transactional clear, to do this, we must iterate and then do `db.del`
DB.get('key');
DB.get(['level1', 'key']);
// these are not allowed, or they are equivalent to asking for undefined
DB.get([]);
DB.get();
DB.put('key', 'value');
DB.put(['level1', 'key'], 'value');
DB.del('key');
DB.del(['level1', 'key']);
DBTransaction.del('key');
DBTransaction.del(['level1', 'key']);
Still need to add a `DBTransaction.clear` method to be a transactional clear that iterates over a sublevel and deletes all the entries.
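A transactional clear along these lines could look like the sketch below. The `Tran` interface is a hypothetical minimal surface written for this example and is not the real `DBTransaction` API. The name `clearLevel` is also hypothetical.

```ts
// Hypothetical minimal transaction surface, for illustration only
interface Tran {
  // Iterate entries under a level path, yielding keys relative to that level
  iterator(levelPath: readonly string[]): AsyncIterable<[string, string]>;
  // Delete the entry at a full key path
  del(keyPath: readonly string[]): Promise<void>;
}

// Clear a level by iterating over it and deleting each entry through the
// transaction, so the clear is subject to the same commit or rollback.
async function clearLevel(
  tran: Tran,
  levelPath: readonly string[],
): Promise<void> {
  for await (const [key] of tran.iterator(levelPath)) {
    await tran.del([...levelPath, key]);
  }
}
```

Because every delete goes through the transaction, nothing would be removed from the underlying data until the transaction commits, assuming the transaction buffers its writes as described earlier.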
Ok this is ready to merge.
Description
The `withF` and `withG` generalise resource acquisition and release. The DB transaction is one of those resources. Since we share a DB being used in PK, having these methods would be useful.
Long term, the `withF` and `withG` types and utility functions would be extracted into their own package like `js-context` so they can be shared.
Over time this should replace the `DB.transact` and `DB.withLocks` methods, and instead just provide the `DB.transaction` method.
The transactional iterator implementation follows the iterator on LevelDB including the exception behaviour like in the `leveldown` and `abstract-leveldown` packages. This is a stop-gap solution until we migrate the whole DB implementation over to `abstract-level` #11.
Downstream packages EFS and PK will need to be updated.
Issues Fixed
Tasks
`withF` and `withG`) by integrating `ResourceAcquire` and `ResourceRelease` types into a `DB.transaction` method.
[x] 2. Add an `iterator` method into `Transaction` and `DBTransaction` interface so that we can perform transactional iteration. This will still use the `AbstractIterator` type, which is not currently up to date in upstream which will necessitate the usage of `@ts-ignore` comments. See: https://github.com/DefinitelyTyped/DefinitelyTyped/discussions/59265 This iterator must act like an "overlay", in that writes in the transaction take priority over data that is currently in the DB.
`DB.withLocks` and `DB.transact` as this uses locking code, and locks are not necessary for our read-committed isolation level transactions, advisory-locks for transactions are now up to the user to manage.
`data` and `transactions` and eventually `index` to support temporary state, so that users of `DB` will only be storing data under `data` sublevel.
`db.db` shouldn't be used anymore, as the root of data is now in `db.dataDb`, the `db.db` is still the underlying root LevelDB database, but this shouldn't be used unless you're doing something hacky on the LevelDB database.
`withF` and `withG` and `ResourceAcquire` and `ResourceRelease` types into `@matrixai/resources` package, which will be used by PK as well and subsequently EFS.
`Transaction.test.ts` to formally test the `Transaction` class, and formalise the 4 isolation-levels vs read-phenomena properties into the tests.
`DBTransaction` interface, and renamed `Transaction` class to `DBTransaction`, this just simplifies the types
`AbstractIterator` in favour of our own `DBIterator` since the upstream type is out of date, and this avoids users having to do `// @ts-ignore` all the time.
Final checklist