josephg / node-foundationdb

Modern Node.js FoundationDB bindings
Other
115 stars 17 forks source link

SIGABRT #33

Open cisaacson opened 4 years ago

cisaacson commented 4 years ago

I have a test that implements 2 transactions in sequence, each with a Tuple key (of buffers) and a Value that is a buffer. Each transaction only does a set. The first transaction succeeds, and the second fails with the trace below. The trace only shows up when we try and debug the code, otherwise it just hangs on the second transaction. It seems as if the commit is not occurring, yet that should be automatic. We are running Node 10.16.0, Typescript 3.6, compiling to es5. Any ideas what could cause this? What is odd is that I have other tests that do multiple writes and they all work fine.

I tried this with tn.set inside the transaction, and also with await db.set as the convenience method. Both hang in the same way.

Any help appreciated.

/usr/local/opt/node@10/bin/node[92513]: ../src/callbackscope.cc:110:void node::InternalCallbackScope::Close(): Assertion `(env->execution_async_id()) == (0)' failed. 1: 0x10003b540 node::Abort() [/usr/local/opt/node@10/bin/node] 2: 0x10003a638 node::AddEnvironmentCleanupHook(v8::Isolate, void ()(void), void) [/usr/local/opt/node@10/bin/node] 3: 0x100012820 node::InternalCallbackScope::Close() [/usr/local/opt/node@10/bin/node] 4: 0x1000122f0 node::InternalCallbackScope::~InternalCallbackScope() [/usr/local/opt/node@10/bin/node] 5: 0x1000a50fa node::PerIsolatePlatformData::RunForegroundTask(std::__1::unique_ptr<v8::Task, std::1::default_delete >) [/usr/local/opt/node@10/bin/node] 6: 0x1000a46f3 node::PerIsolatePlatformData::FlushForegroundTasksInternal() [/usr/local/opt/node@10/bin/node] 7: 0x1000a553f node::NodePlatform::FlushForegroundTasks(v8::Isolate*) [/usr/local/opt/node@10/bin/node] 8: 0x1000eeda2 node::inspector::NodeInspectorClient::runMessageLoop() [/usr/local/opt/node@10/bin/node] 9: 0x10014f9b4 v8_inspector::V8Debugger::handleProgramBreak(v8::Local, v8::Local, std::1::vector<int, std::__1::allocator > const&, bool, bool) [/usr/local/opt/node@10/bin/node] 10: 0x1003afc36 v8::internal::Debug::OnDebugBreak(v8::internal::Handle) [/usr/local/opt/node@10/bin/node] 11: 0x1003af3c6 v8::internal::Debug::Break(v8::internal::JavaScriptFrame*, v8::internal::Handle) [/usr/local/opt/node@10/bin/node] 12: 0x1005cf440 v8::internal::Runtime_DebugBreakOnBytecode(int, v8::internal::Object*, v8::internal::Isolate) [/usr/local/opt/node@10/bin/node] 13: 0xd20a74dc69d 14: 0xd20a74e1de5 15: 0xd20a74918d5 16: 0xd20a79bf1f2

cisaacson commented 4 years ago

In testing this more, the stack trace only happens in the debugger (there are other similar issues re: nodejs that are documented for other projects).

However, by using logging (without the debugger) it just hangs after the set is complete, it never exits the transaction block. When using db.set it never returns on the second iteration.

Here is the code for completeness:

        const txResult = await this.db.doTransaction(async tn => {
          logger.debug(`before tn.set. id: ${id}`);
          tn.set([prefixbuf1, prefixbuf2, keyBuffer], valueBuffer);
          logger.debug(`after tn.set.  id: ${id}`);
          return true;
        });

The above is called in a loop with 2 id values, fails on the second iteration. With just one iteration it works fine.

cisaacson commented 4 years ago

This is some sort of timing issue, I am not sure why.

I added the following after each iteration:

await this.delay(1);

And it works perfectly on every test.

Any ideas? I don't see what could be changed in the code to avoid it, seems like something with the Nodejs V8 event loop perhaps.

josephg commented 4 years ago

Interesting! I’ll take a look in the next few days and see if I can reproduce this locally. Thanks for the detailed report!

It certainly looks like a bug in something...

What version of foundationdb are you using, and on what OS?

josephg commented 4 years ago

Also can you run this command for me and tell me the output?

node -e 'console.log(require("foundationdb").modType)'

I expect you're using napi but I'd like to check.

cisaacson commented 4 years ago

I am running foundationdb: v6.1.12 My OS is: MacOS Here is the output, it looks like 'nan':

node -e 'console.log(require("foundationdb").modType)' nan

I do think that when the transaction is run in a loop, you are not awaiting the commit properly, that seems like the issue. It could be that this is always happening, but it is only noticeable in that case.

Here is a more complete version of the code that causes it:

for (const id of idList) {
       const txResult = await this.db.doTransaction(async tn => {
          logger.debug(`before tn.set. id: ${id}`);
          tn.set([prefixbuf1, prefixbuf2, keyBuffer], valueBuffer);
          logger.debug(`after tn.set.  id: ${id}`);
          return true;
        });
}

The second await never returns, but if I delay as above, then it works.

Hope that helps.

josephg commented 4 years ago

I'm failing to reproduce this issue. My script is below. Can you try running that and see if this works?

const fdb = require('foundationdb')
fdb.setAPIVersion(600)

const db = fdb.openSync()
  .at('test.')
  .withKeyEncoding(fdb.encoders.tuple)

const prefixbuf1 = Buffer.from('prefix 1')
const prefixbuf2 = Buffer.from('prefix 2')
const keyBuffer = Buffer.from('key')
const valueBuffer = Buffer.from('oh hai')

const idList = [1,2,3,4,5].map(id => `${id}`)

;(async () => {
  for (const id of idList) {
    const txResult = await db.doTransaction(async tn => {
      console.log(`before tn.set. id: ${id}`);
      //const keyBuffer = Buffer.from(id)
      tn.set([prefixbuf1, prefixbuf2, keyBuffer], valueBuffer);
      console.log(`after tn.set.  id: ${id}`);
      return true;
    });
  }

  console.log('ok done')
})()

This works just fine, either setting the same key each time or not, and compiled (using tsc) to es5 or run directly.

$ sw_vers 
ProductName:    Mac OS X
ProductVersion: 10.14.6
BuildVersion:   18G95
$ node --version
v10.16.0
$ fdbcli --version
FoundationDB CLI 6.1 (v6.1.12)
source version 90ba203c166c62ae9173b2523081a0c6afaa1604
protocol fdb00b061060001
cisaacson commented 4 years ago

Thanks for looking into this for me. Your test works fine for me too when I run it. But I do think I have narrowed down the issue.

I have narrowed it down using the debugger to exactly the line that causes the failure.

        const txResult = await this.db.doTransaction(async tn => {
          logger.debug(`before tn.set. locationId: ${locationId}`);
          tn.set([dataKeyBuffer], dataValueBuffer);
          logger.debug(`after tn.set.  locationId: ${locationId}`);
          return true; // When you step through or over this line the stack trace occurs
        });

        // await this.delay(1); // With this in place it does not occur

When it hits return true the first time through, the debugger crashes with the stack below. Without the debugger it just hangs, and never exits the tx. If I uncomment the delay then it runs fine in the debugger (which is also odd as that is after the tx block. Your test fails in the debugger too.

My test is running from mocha, and I the test function is prefixed with async () => {}. When I run your test there is no error, but as above, when I debug your test, it fails at exactly the same place and crashes the debugger with the stack trace.

The hang occurs in exactly the same place, but that only happens when I call another method first which creates a stream with promisePipe that uses another libary that also has low-level code. If I comment that out I get it to run correctly, but the debugger still fails with the stack trace.

Therefore, I do think something is wrong at the C code level (but hard to know what that could be), because of the debug stack trace. I suspect if you can find that, it should be OK with my other library.

I hope this helps to track it down. One reason I did all these variations is that I wanted to be sure it wasn't caused by one of the other libraries I am using was not causing the issue (I had wierd things with a version of that earlier).

Here is the mocha code for your test, try debug on this (I use WebStorm, but it should be the standard debugger regardless):

it.only(`should run foundationdb test with loop from Seph`, async () => {
    const fdb = require('foundationdb');
    fdb.setAPIVersion(600);

    const db = fdb.openSync()
      .at('test.')
      .withKeyEncoding(fdb.encoders.tuple);

    const prefixbuf1 = Buffer.from('prefix 1');
    const prefixbuf2 = Buffer.from('prefix 2');
    const keyBuffer = Buffer.from('key');
    const valueBuffer = Buffer.from('oh hai');

    const idList = [1,2,3,4,5].map(id => `${id}`);

    // (async () => {
      for (const id of idList) {
        logger.debug(`id: ${id}`);
        const txResult = await db.doTransaction(async tn => {
          console.log(`before tn.set. id: ${id}`);
          const keyBuffer = Buffer.from(id);
          tn.set([prefixbuf1, prefixbuf2, keyBuffer], valueBuffer);
          console.log(`after tn.set.  id: ${id}`);
          return true; // Debugger crashes here
        });
      }

      console.log('ok done')
    // })();

  });

Debug stack trace:

/usr/local/opt/node@10/bin/node[34012]: ../src/callback_scope.cc:110:void node::InternalCallbackScope::Close(): Assertion `(env_->execution_async_id()) == (0)' failed.
 1: 0x10003b540 node::Abort() [/usr/local/opt/node@10/bin/node]
 2: 0x10003a638 node::AddEnvironmentCleanupHook(v8::Isolate*, void (*)(void*), void*) [/usr/local/opt/node@10/bin/node]
 3: 0x100012820 node::InternalCallbackScope::Close() [/usr/local/opt/node@10/bin/node]
 4: 0x1000122f0 node::InternalCallbackScope::~InternalCallbackScope() [/usr/local/opt/node@10/bin/node]
 5: 0x1000a50fa node::PerIsolatePlatformData::RunForegroundTask(std::__1::unique_ptr<v8::Task, std::__1::default_delete<v8::Task> >) [/usr/local/opt/node@10/bin/node]
 6: 0x1000a46f3 node::PerIsolatePlatformData::FlushForegroundTasksInternal() [/usr/local/opt/node@10/bin/node]
 7: 0x1000a553f node::NodePlatform::FlushForegroundTasks(v8::Isolate*) [/usr/local/opt/node@10/bin/node]
 8: 0x1000eeda2 node::inspector::NodeInspectorClient::runMessageLoop() [/usr/local/opt/node@10/bin/node]
 9: 0x10014f9b4 v8_inspector::V8Debugger::handleProgramBreak(v8::Local<v8::Context>, v8::Local<v8::Value>, std::__1::vector<int, std::__1::allocator<int> > const&, bool, bool) [/usr/local/opt/node@10/bin/node]
10: 0x1003afc36 v8::internal::Debug::OnDebugBreak(v8::internal::Handle<v8::internal::FixedArray>) [/usr/local/opt/node@10/bin/node]
11: 0x1003af3c6 v8::internal::Debug::Break(v8::internal::JavaScriptFrame*, v8::internal::Handle<v8::internal::JSFunction>) [/usr/local/opt/node@10/bin/node]
12: 0x1005cf440 v8::internal::Runtime_DebugBreakOnBytecode(int, v8::internal::Object**, v8::internal::Isolate*) [/usr/local/opt/node@10/bin/node]
13: 0x83f6425c69d

Hope that helps!

josephg commented 4 years ago

I still can't reproduce your issue. It runs fine for me, though I'm running it directly from the terminal with npx mocha test2.js -- --inspect. Can you make a reproducing test case I can run in my terminal without webstorm?

That stack trace is a crash inside the debugger. It might run better if you run the napi codepath instead. Can you try with node 11 or node 12?

josephg commented 2 years ago

Hi! Is this still an issue? Some bugs have been fixed since the heady, covid-free days of Sept 2019. I'll close the issue if not.