I would suggest that most of this probably does belong at the application level. We also build this same type of functionality on top of lmdb-js, so it definitely works well to do that. At some point it might be nice to have "deleted" entry awareness in lmdb-js for record counts, but we have the exact same concept in the db software we build. I will also mention that the `txnId` is available in the `aftercommit` listener event.
Thanks for the hints. That will help me greatly.
> We also build this same type of functionality on top of lmdb-js, so it definitely works well to do that.

Is that open source by any chance? I would love to pillage the sources a bit. :D
I finally got time to play with this. The message packing and unpacking seems relatively straightforward, but the event listeners don't seem to work. (Not sure if I should open a new ticket.)
```js
const lmdb = require('lmdb');
const db = lmdb.open({ path: 'test_db' });
let token_db = db.openDB('tokens');

db.on('aftercommit', ({ next, last, txnId }) => {
  console.log('aftercommit', txnId);
});
db.on('beforecommit', (...args) => {
  const parameters = args.join(', ');
  console.log(`beforecommit event with parameters ${parameters}`);
});
token_db.on('aftercommit', ({ next, last, txnId }) => {
  console.log('aftercommit', txnId);
});
token_db.on('beforecommit', (...args) => {
  const parameters = args.join(', ');
  console.log(`beforecommit event with parameters ${parameters}`);
});

token_db.putSync('test', 'test');
token_db.get('test');

console.log('events: ', db.eventNames());
console.log('events: ', token_db.eventNames());
```
outputs:

```
events: []
events: []
```
Also, I need the txnId before I write out data. With `aftercommit` I only see a way to learn the txnId after the first commit. Is there a way to access the highest txnId before a commit is made?
The commit events are only fired for asynchronous transactions, not synchronous ones as in the example above. I suppose they could be triggered for synchronous transactions too, but that doesn't seem very helpful since those are already explicitly started and stopped in the main thread. Instead I added a `getWriteTxnId` method (in the referenced commit) that can be used to get the transaction id inside explicit transaction callbacks.
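A minimal sketch of both points, assuming a lmdb-js version that includes the `getWriteTxnId` method from the referenced commit:

```js
const { open } = require('lmdb');
const db = open({ path: 'test_db' });

// Commit events fire for asynchronous writes (put), not for putSync:
db.on('aftercommit', ({ txnId }) => {
  console.log('aftercommit', txnId);
});

(async () => {
  await db.put('test', 'test'); // batched async write; aftercommit fires on commit

  // Inside an explicit transaction the id is available before committing:
  db.transactionSync(() => {
    console.log('current write txn id:', db.getWriteTxnId());
    db.put('test2', 'test2');
  });
})();
```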
Cool, that's very useful. Thanks a lot.
I was thinking of wrapping my code in transactions even though it's not really useful for me, but then realised I had no way of accessing the txnId of the ongoing transaction, so there was no benefit to that anyway. Really cool to be able to do it like this now.
I'm struggling to use the encoding module to implement this, mostly because I have to re-use existing headers when updating entries. (I'm not allowed to drop unused flags and header extensions, but that context would be lost.)

Is there something like putBinary() complementary to getBinary()? Passing a Buffer into put() puts a prefix before the data. I can of course use a null encoder, but putBinary() would be a bit clearer in the code.
Yes, I think you want to use `db.put(key, asBinary(alreadyEncodedBuffer))` (`asBinary` is an export of `lmdb`).
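For example, a quick round-trip sketch (the key and data are just illustrative):

```js
const { open, asBinary } = require('lmdb');
const db = open({ path: 'test_db' });

(async () => {
  const raw = Buffer.from([0x01, 0x02, 0x03]);
  await db.put('key', asBinary(raw)); // stores exactly these bytes, no encoder prefix
  console.log(db.getBinary('key'));   // <Buffer 01 02 03>
})();
```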
Oh, how embarrassing. I was convinced asBinary() put a two-byte header in front of my data. I did a lot of testing with zero buffers and got spurious data at the front, but today, with coffee and more sleep, I see it's working as intended. :facepalm: Sorry to waste your time.
For anyone coming here via Google:
```js
const msgpackr = require('msgpackr');
const { asBinary } = require('lmdb'); // used by the snippets below

// Header offsets
// https://github.com/PowerDNS/lightningstream/blob/main/docs/schema-native.md
const LS_HEADER_SIZE = 24;
const LS_HEADER_POS_TIMESTAMP = 0;
const LS_HEADER_POS_TXN = 8;
const LS_HEADER_POS_SCHEMA = 16;
const LS_HEADER_POS_FLAGS = 17;
const LS_HEADER_POS_EXTENSION_COUNT = 22;
const LS_EXTENSION_HEADER_SIZE = 8;
const LS_FLAGS_DELETED = 1 << 0;

class LSData {
  #header;
  #value;

  constructor(data) {
    if (data) {
      // Parse a buffer
      this.unpack(data);
    } else {
      // Init an empty object
      this.#header = Buffer.alloc(LS_HEADER_SIZE);
    }
  }

  unpack(data) {
    let extension_headers = data.readUInt8(LS_HEADER_POS_EXTENSION_COUNT);
    let header_len =
      LS_HEADER_SIZE + extension_headers * LS_EXTENSION_HEADER_SIZE;
    let header = data.subarray(0, header_len);
    let buf = data.subarray(header.length);
    if (header.readUInt8(LS_HEADER_POS_SCHEMA) != 0) {
      throw Error("Schema version of Lightning header is not 0. Not allowed!");
    }
    this.#header = header;
    // Assign the private field directly: going through the value setter
    // would clear the deleted flag of an entry we are merely parsing.
    this.#value = buf.length > 0 ? msgpackr.unpack(buf) : undefined;
    return this;
  }

  asBuffer(txnID) {
    this.#header.writeBigInt64BE(BigInt(txnID), LS_HEADER_POS_TXN);
    // The schema expects a UNIX timestamp in nanoseconds;
    // process.hrtime.bigint() is monotonic, not epoch-based.
    this.#header.writeBigInt64BE(
      BigInt(Date.now()) * 1000000n,
      LS_HEADER_POS_TIMESTAMP
    );
    let buf = [this.#header];
    if (!this.deleted) {
      buf.push(msgpackr.pack(this.#value)); // Only write data if it was set
    }
    return Buffer.concat(buf);
  }

  get deleted() {
    let flags = this.#header.readUInt8(LS_HEADER_POS_FLAGS);
    return (flags & LS_FLAGS_DELETED) != 0;
  }

  set deleted(val) {
    let delflag = val ? LS_FLAGS_DELETED : 0;
    let flags = this.#header.readUInt8(LS_HEADER_POS_FLAGS);
    // Touch only the deleted bit; all other flags must be preserved.
    flags = (flags & ~LS_FLAGS_DELETED) | delflag;
    this.#header.writeUInt8(flags, LS_HEADER_POS_FLAGS);
  }

  set value(val) {
    this.deleted = false;
    this.#value = val;
  }

  get value() {
    return this.#value;
  }
}
```
```js
// Store
return db.transactionSync(() => {
  let data;
  if (db.doesExist(key)) {
    data = db.getBinary(key); // Re-use the existing header (flags, extensions)
  }
  let ls_entry = new LSData(data);
  ls_entry.deleted = false;
  ls_entry.value = { /* your data */ };
  return db.put(key, asBinary(ls_entry.asBuffer(db.getWriteTxnId())));
});

// Retrieve
let data = db.getBinary(key);
if (!data) {
  return null; // No entry
}
let ls_entry = new LSData(data);
if (ls_entry.deleted) {
  return null; // Entry is marked as deleted
}
let value = ls_entry.value;

// Delete
return db.transactionSync(() => {
  let data;
  if (db.doesExist(key)) {
    data = db.getBinary(key);
  }
  let ls_entry = new LSData(data);
  ls_entry.deleted = true;
  return db.put(key, asBinary(ls_entry.asBuffer(db.getWriteTxnId())));
});
```
If you don't mind, I'll pop this one open again. :smile:

So I have a db that I can read and write, and I think the data is Lightning Stream compatible. But lightningstream can't even open my db.
time="2024-02-14T11:22:35Z" level=info msg="PID satisfies minimum" minimum_pid=50 pid=59
time="2024-02-14T11:22:38Z" level=info msg="Storage backend initialised" storage_type=s3
time="2024-02-14T11:22:38Z" level=info msg="[main ] Opening LMDB" db=main lmdbpath=/lmdb/instance-1/db
time="2024-02-14T11:22:38Z" level=info msg="[main ] Env info" LastTxnID=0 MapSize="1024.0 MB" db=main
time="2024-02-14T11:22:38Z" level=info msg="registered tracker for failure duration" healthtracker=main_storage_store
time="2024-02-14T11:22:38Z" level=info msg="registered tracker for startup phase" starttracker=main
time="2024-02-14T11:22:38Z" level=info msg="[main ] schema_tracks_changes enabled" db=main instance=instance-1
time="2024-02-14T11:22:38Z" level=info msg="[main ] Initialised syncer" db=main instance=instance-1
time="2024-02-14T11:22:38Z" level=info msg="[shard ] Opening LMDB" db=shard lmdbpath=/lmdb/instance-1/db-0
time="2024-02-14T11:22:38Z" level=info msg="[main ] Enabled LMDB stats logging" db=main instance=instance-1 interval=30m0s
time="2024-02-14T11:22:38Z" level=info msg="registered tracker for failure duration" healthtracker=main_storage_list
time="2024-02-14T11:22:38Z" level=info msg="registered tracker for failure duration" healthtracker=main_storage_load
time="2024-02-14T11:22:38Z" level=info msg="[shard ] Env info" LastTxnID=0 MapSize="1024.0 MB" db=shard
time="2024-02-14T11:22:38Z" level=info msg="registered tracker for failure duration" healthtracker=shard_storage_store
time="2024-02-14T11:22:38Z" level=info msg="registered tracker for startup phase" starttracker=shard
time="2024-02-14T11:22:38Z" level=info msg="[shard ] schema_tracks_changes enabled" db=shard instance=instance-1
time="2024-02-14T11:22:38Z" level=info msg="[shard ] Initialised syncer" db=shard instance=instance-1
time="2024-02-14T11:22:38Z" level=info msg="[authdb ] Opening LMDB" db=authdb lmdbpath=/lmdb/instance-1/authdb
time="2024-02-14T11:22:38Z" level=info msg="[shard ] Enabled LMDB stats logging" db=shard instance=instance-1 interval=30m0s
time="2024-02-14T11:22:38Z" level=info msg="registered tracker for failure duration" healthtracker=shard_storage_list
time="2024-02-14T11:22:38Z" level=info msg="registered tracker for failure duration" healthtracker=shard_storage_load
time="2024-02-14T11:22:38Z" level=fatal msg=Error error="lmdb env: open: mdb_env_open: MDB_INVALID: File is not an LMDB file"
time="2024-02-14T11:22:38Z" level=warning msg="Exiting with exit code" exitcode=1 pid=1
The first DBs are the ones from pdns; `authdb` is mine.

It's a big mess. The LMDB file created by lmdb-js on the `node:current-slim` image supposedly isn't even a database:
```
[nix-shell:~/git/lightningstream]$ sudo mdb_stat -a -n -e -f -r /home/tilli/.local/share/containers/storage/volumes/lightningstream_lmdb/_data/instance-1/authdb
mdb_env_open failed, error -30793 MDB_INVALID: File is not an LMDB file
[nix-shell:~/git/lightningstream]$ sudo mdb_stat -a -n -e -f -r /home/tilli/.local/share/containers/storage/volumes/lightningstream_lmdb/_data/instance-1/db
Environment Info
  Map address: (nil)
  Map size: 1073741824
  Page size: 4096
  Max pages: 262144
  Number of pages used: 2
  Last transaction ID: 0
  Max readers: 126
  Number of readers used: 0
Reader Table Status
  (no active readers)
Freelist Status
  Tree depth: 0
  Branch pages: 0
  Leaf pages: 0
  Overflow pages: 0
  Entries: 0
  Free pages: 0
Status of Main DB
  Tree depth: 0
  Branch pages: 0
  Leaf pages: 0
  Overflow pages: 0
  Entries: 0
```
The node instance on my dev system is subtly different:

```
$ mdb_stat -n lmdb_ls_native_db
mdb_env_open failed, error -30794 MDB_VERSION_MISMATCH: Database environment version mismatch
$ mdb_stat -V
LMDB 0.9.31: (July 10, 2023)
```
I suddenly feel transported back to the '90s. Does LMDB really have incompatible binary versions?
PS: Apparently lightningstream is unhappy if you don't use `noSubdir: true`.
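For reference, a minimal sketch of that option (the path is just illustrative):

```js
// Open the environment as a single file rather than a directory,
// which is the layout Lightning Stream expects here.
const { open } = require('lmdb');
const db = open({ path: '/lmdb/instance-1/authdb', noSubdir: true });
```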
I tried with `--use_data_v1=true`, which should make lmdb-js build against the older stock LMDB data format (based on the node:buster base image, so all the build tools are there):

```
npm install lmdb --build-from-source --use_data_v1=true
```
```
npm notice
npm notice New minor version of npm available! 10.1.0 -> 10.4.0
npm notice Changelog: <https://github.com/npm/cli/releases/tag/v10.4.0>
npm notice Run `npm install -g npm@10.4.0` to update!
npm notice
npm ERR! code 1
npm ERR! path /user/src/myapp/node_modules/lmdb
npm ERR! command failed
npm ERR! command sh -c node-gyp-build-optional-packages
npm ERR! make: Entering directory '/user/src/myapp/node_modules/lmdb/build'
npm ERR!   CXX(target) Release/obj.target/lmdb/src/lmdb-js.o
npm ERR!   CC(target) Release/obj.target/lmdb/dependencies/lmdb/libraries/liblmdb/midl.o
npm ERR!   CC(target) Release/obj.target/lmdb/dependencies/lmdb/libraries/liblmdb/chacha8.o
npm ERR!   CC(target) Release/obj.target/lmdb/dependencies/lz4/lib/lz4.o
npm ERR!   CXX(target) Release/obj.target/lmdb/src/writer.o
npm ERR! make: Leaving directory '/user/src/myapp/node_modules/lmdb/build'
npm ERR! gyp info it worked if it ends with ok
npm ERR! gyp info using node-gyp@9.4.0
npm ERR! gyp info using node@20.8.1 | linux | x64
npm ERR! gyp info find Python using Python version 3.7.3 found at "/usr/bin/python3"
npm ERR! gyp http GET https://nodejs.org/download/release/v20.8.1/node-v20.8.1-headers.tar.gz
npm ERR! gyp http 200 https://nodejs.org/download/release/v20.8.1/node-v20.8.1-headers.tar.gz
npm ERR! gyp http GET https://nodejs.org/download/release/v20.8.1/SHASUMS256.txt
npm ERR! gyp http 200 https://nodejs.org/download/release/v20.8.1/SHASUMS256.txt
npm ERR! gyp info spawn /usr/bin/python3
npm ERR! gyp info spawn args [
npm ERR! gyp info spawn args   '/usr/local/lib/node_modules/npm/node_modules/node-gyp/gyp/gyp_main.py',
npm ERR! gyp info spawn args   'binding.gyp',
npm ERR! gyp info spawn args   '-f',
npm ERR! gyp info spawn args   'make',
npm ERR! gyp info spawn args   '-I',
npm ERR! gyp info spawn args   '/user/src/myapp/node_modules/lmdb/build/config.gypi',
npm ERR! gyp info spawn args   '-I',
npm ERR! gyp info spawn args   '/usr/local/lib/node_modules/npm/node_modules/node-gyp/addon.gypi',
npm ERR! gyp info spawn args   '-I',
npm ERR! gyp info spawn args   '/root/.cache/node-gyp/20.8.1/include/node/common.gypi',
npm ERR! gyp info spawn args   '-Dlibrary=shared_library',
npm ERR! gyp info spawn args   '-Dvisibility=default',
npm ERR! gyp info spawn args   '-Dnode_root_dir=/root/.cache/node-gyp/20.8.1',
npm ERR! gyp info spawn args   '-Dnode_gyp_dir=/usr/local/lib/node_modules/npm/node_modules/node-gyp',
npm ERR! gyp info spawn args   '-Dnode_lib_file=/root/.cache/node-gyp/20.8.1/<(target_arch)/node.lib',
npm ERR! gyp info spawn args   '-Dmodule_root_dir=/user/src/myapp/node_modules/lmdb',
npm ERR! gyp info spawn args   '-Dnode_engine=v8',
npm ERR! gyp info spawn args   '--depth=.',
npm ERR! gyp info spawn args   '--no-parallel',
npm ERR! gyp info spawn args   '--generator-output',
npm ERR! gyp info spawn args   'build',
npm ERR! gyp info spawn args   '-Goutput_dir=.'
npm ERR! gyp info spawn args ]
npm ERR! gyp info spawn make
npm ERR! gyp info spawn args [ 'BUILDTYPE=Release', '-C', 'build' ]
npm ERR! ../src/writer.cpp: In member function 'int WriteWorker::WaitForCallbacks(MDB_txn**, bool, uint32_t*)':
npm ERR! ../src/writer.cpp:126:17: error: 'MDB_TRACK_METRICS' was not declared in this scope
npm ERR!   if (envFlags & MDB_TRACK_METRICS)
npm ERR!                  ^~~~~~~~~~~~~~~~~
npm ERR! ../src/writer.cpp:135:20: error: 'MDB_TRACK_METRICS' was not declared in this scope
npm ERR!   if (envFlags & MDB_TRACK_METRICS)
npm ERR!                  ^~~~~~~~~~~~~~~~~
npm ERR! ../src/writer.cpp:144:17: error: 'MDB_TRACK_METRICS' was not declared in this scope
npm ERR!   if (envFlags & MDB_TRACK_METRICS)
npm ERR!                  ^~~~~~~~~~~~~~~~~
npm ERR! ../src/writer.cpp: In static member function 'static int WriteWorker::DoWrites(MDB_txn*, EnvWrap*, uint32_t*, WriteWorker*)':
npm ERR! ../src/writer.cpp:359:19: warning: deleting 'void*' is undefined [-Wdelete-incomplete]
npm ERR!   delete value.mv_data;
npm ERR!          ^~~~~~~
npm ERR! ../src/writer.cpp:367:19: warning: deleting 'void*' is undefined [-Wdelete-incomplete]
npm ERR!   delete value.mv_data;
npm ERR!          ^~~~~~~
npm ERR! ../src/writer.cpp: In member function 'void WriteWorker::Write()':
npm ERR! ../src/writer.cpp:449:6: warning: unused variable 'retries' [-Wunused-variable]
npm ERR!   int retries = 0;
npm ERR!       ^~~~~~~
npm ERR! ../src/writer.cpp:450:2: warning: label 'retry' defined but not used [-Wunused-label]
npm ERR!  retry:
npm ERR!  ^~~~~
npm ERR! make: *** [lmdb.target.mk:159: Release/obj.target/lmdb/src/writer.o] Error 1
npm ERR! gyp ERR! build error
npm ERR! gyp ERR! stack Error: `make` failed with exit code: 2
npm ERR! gyp ERR! stack at ChildProcess.onExit (/usr/local/lib/node_modules/npm/node_modules/node-gyp/lib/build.js:203:23)
npm ERR! gyp ERR! stack at ChildProcess.emit (node:events:514:28)
npm ERR! gyp ERR! stack at ChildProcess._handle.onexit (node:internal/child_process:294:12)
npm ERR! gyp ERR! System Linux 6.7.3
npm ERR! gyp ERR! command "/usr/local/bin/node" "/usr/local/lib/node_modules/npm/node_modules/node-gyp/bin/node-gyp.js" "rebuild"
npm ERR! gyp ERR! cwd /user/src/myapp/node_modules/lmdb
npm ERR! gyp ERR! node -v v20.8.1
npm ERR! gyp ERR! node-gyp -v v9.4.0
npm ERR! gyp ERR! not ok
npm ERR! A complete log of this run can be found in: /root/.npm/_logs/2024-02-14T16_55_23_605Z-debug-0.log
```
I guess there are some non-obvious build dependencies...

Closing, because I think it's not really related to this issue.
I want to write a simple application with native Lightning Stream support. It basically needs additional content headers (https://github.com/PowerDNS/lightningstream/blob/main/docs/schema-native.md), and values that are deleted should not actually be deleted but should have the deleted flag set (and the value set to empty).

The header seems pretty easy via custom encoders, but I suppose the devil is in the details, like having access to the transaction ID at that time. Filtering deleted values and overriding remove() would also be required...

I haven't produced any code yet, but I wonder whether it makes sense to support this in this library or in a wrapper library? I mean, implementing it at the application level would be cool too; the particular application can very well be polluted with back-end specifics.