kriszyp / lmdb-js

Simple, efficient, ultra-fast, scalable data store wrapper for LMDB

Strange node abort with immediate read/write #268

Closed atownley closed 9 months ago

atownley commented 9 months ago

Hi,

Since I got things migrated to lmdb and sorted out the sync vs. async transaction issue, things have been going extremely well. I'm super happy with the performance, the parallel access and everything it says on the tin.

In tracking down an issue today in another area of code, I discovered something that needed to be persisted that I'd missed.

It is a small structure, and I've tried storing it both manually serialized to JSON and unserialized, so serialization isn't the problem.

The sequence below runs inside a utility function that basically increments a counter:

var val = db.get(key);
// use initializer
db.putSync(key, newValue);

And this is also inside a larger synchronous transaction. I'm not doing any asynchronous writes or transactions at all in my code right now.
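
For readers following along, the read-then-write pattern described above can be sketched roughly as follows. This is only an illustration, not the code from the report: `incrementCounter` and `createMemoryDb` are hypothetical names, and the in-memory stand-in exists solely so the sketch runs without lmdb installed. It assumes the database object exposes `get`, `putSync`, and `transactionSync`, as lmdb-js databases do.

```javascript
// Hypothetical helper: read the current value, fall back to an initial
// value when the key is missing, then write the incremented value back.
// `db` is assumed to expose get(key) and putSync(key, value).
function incrementCounter(db, key, initial = 0) {
  const current = db.get(key);
  const next = (current === undefined ? initial : current) + 1;
  db.putSync(key, next);
  return next;
}

// Hypothetical in-memory stand-in for a database, so the sketch is runnable
// on its own. It is NOT lmdb; it only mimics the three method names used here.
function createMemoryDb() {
  const store = new Map();
  return {
    get: (key) => store.get(key),
    putSync: (key, value) => {
      store.set(key, value);
      return true;
    },
    // Simply runs the callback; a real store would also make the writes atomic.
    transactionSync: (callback) => callback(),
  };
}

const db = createMemoryDb();

// Two increments inside one enclosing synchronous transaction.
const results = db.transactionSync(() => [
  incrementCounter(db, 'counter'),
  incrementCounter(db, 'counter'),
]);

console.log(results); // logs [ 1, 2 ]
```

With a real lmdb-js database, the object returned by `open()` would take the place of `createMemoryDb()`. Nothing in this pattern is unusual, which is consistent with the crash turning out to be unrelated to the read/write sequence itself.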

It's not crazy, and I'm doing hundreds of other reads and writes all the time. And, under current "normal" operations, I don't see the problem.

However, when I run my unit tests, they complete fine, and then I get this:

FATAL ERROR: v8::HandleScope::CreateHandle() Cannot create a handle without a HandleScope
 1: 0x1019ad495 node::Abort() [/Users/XXX/.nvm/versions/node/v21.1.0/bin/node]
 2: 0x1019ad581 node::OnFatalError(char const*, char const*) [/Users/XXX/.nvm/versions/node/v21.1.0/bin/node]
 3: 0x101b7c171 v8::Utils::ReportApiFailure(char const*, char const*) [/Users/XXX/.nvm/versions/node/v21.1.0/bin/node]
 4: 0x101d08c22 v8::internal::HandleScope::Extend(v8::internal::Isolate*) [/Users/XXX/.nvm/versions/node/v21.1.0/bin/node]
 5: 0x101b7cde8 v8::HandleScope::CreateHandle(v8::internal::Isolate*, unsigned long) [/Users/XXX/.nvm/versions/node/v21.1.0/bin/node]
 6: 0x101961f06 napi_get_reference_value [/Users/XXX/.nvm/versions/node/v21.1.0/bin/node]
 7: 0x107d9751d EnvWrap::closeEnv(bool) [/Users/XXX/XXXX/node_modules/@lmdb/lmdb-darwin-x64/node.napi.node]
 8: 0x1018f30d0 node::CleanupQueue::Drain() [/Users/XXX/.nvm/versions/node/v21.1.0/bin/node]
 9: 0x101948334 node::Environment::RunCleanup() [/Users/XXX/.nvm/versions/node/v21.1.0/bin/node]
10: 0x1018c53ef node::FreeEnvironment(node::Environment*) [/Users/XXX/.nvm/versions/node/v21.1.0/bin/node]
11: 0x1019f49b0 node::NodeMainInstance::Run() [/Users/XXX/.nvm/versions/node/v21.1.0/bin/node]
12: 0x101971b33 node::Start(int, char**) [/Users/XXX/.nvm/versions/node/v21.1.0/bin/node]
13: 0x7fff20349f3d start [/usr/lib/system/libdyld.dylib]
14: 0x2 
/bin/sh: line 1: 55708 Abort trap: 6

I'm using lmdb version 2.8.5 on macOS 11.7.10.

Any idea how I can track this down further? The only thing I can think of is that I'm changing the value of the same key almost immediately after reading it, which is something I'm not doing elsewhere.

I'm not really sure what else to isolate, and I've triple-checked everything I can obviously see.

What would you suggest?

Cheers,

ast

kriszyp commented 9 months ago

The stack trace you posted suggests that this error is happening when a thread is terminating. Do you have other threads running in parallel that might have terminated at the same time? The code in question has actually been updated in 2.9 (the napi_get_reference_value call no longer exists). Maybe try 2.9.1 and see if the error persists?

atownley commented 9 months ago

Hey. Thanks for the quick reply.

There are no other threads involved in the way I run the unit tests.

I'll try the upgrade and see what happens. So far, I've been running the changes in a test environment. I'm not stressing it, but it hasn't exhibited the same behavior.

Hopefully, the upgrade will fix it.

I'll let you know, but I won't get a chance to mess with it again for a few days.

Cheers,

ast

atownley commented 9 months ago

Yep. 2.9.1 solved the problem. No more crashing. Thanks!