mikeseven / node-opencl

Low-level OpenCL 1.x and 2.x bindings for node.js

Breaking changes in Node 10 #63

Closed trxcllnt closed 6 years ago

trxcllnt commented 6 years ago

Hey Mike, just a heads up: Node 10 has a new version of V8 with breaking changes.

Seems like more info is available here:
https://github.com/nodejs/nan/issues/289
https://github.com/kkoopa/nan/commit/c66082dee22375c65cbd2563a4457cf7fb7eeac5
https://groups.google.com/forum/#!topic/v8-users/gQVpp1HmbqM
https://bugs.chromium.org/p/v8/issues/detail?id=3929

Compiler errors (each CONVERT_VECTS error was reported five times; it is shown once here):

../src/kernel.cpp:128:54: error: no matching function for call to ‘v8::Value::ToInt32()’
CONVERT_NUMBER("char", cl_char, IsInt32, ToInt32()->Value);

../src/kernel.cpp:129:57: error: no matching function for call to ‘v8::Value::ToUint32()’
CONVERT_NUMBER("uchar", cl_uchar, IsInt32, ToUint32()->Value);

../src/kernel.cpp:130:56: error: no matching function for call to ‘v8::Value::ToInt32()’
CONVERT_NUMBER("short", cl_short, IsInt32, ToInt32()->Value);

../src/kernel.cpp:131:59: error: no matching function for call to ‘v8::Value::ToUint32()’
CONVERT_NUMBER("ushort", cl_ushort, IsInt32, ToUint32()->Value);

../src/kernel.cpp:132:53: error: no matching function for call to ‘v8::Value::ToInt32()’
CONVERT_NUMBER("int", cl_int , IsInt32, ToInt32()->Value);

../src/kernel.cpp:133:56: error: no matching function for call to ‘v8::Value::ToUint32()’
CONVERT_NUMBER("uint", cl_uint, IsUint32, ToUint32()->Value);

../src/kernel.cpp:181:53: error: no matching function for call to ‘v8::Value::ToInt32()’
CONVERT_VECTS("char", cl_char, IsInt32, ToInt32()->Value);

../src/kernel.cpp:182:56: error: no matching function for call to ‘v8::Value::ToUint32()’
CONVERT_VECTS("uchar", cl_uchar, IsInt32, ToUint32()->Value);

../src/kernel.cpp:183:55: error: no matching function for call to ‘v8::Value::ToInt32()’
CONVERT_VECTS("short", cl_short, IsInt32, ToInt32()->Value);

../src/kernel.cpp:184:58: error: no matching function for call to ‘v8::Value::ToUint32()’
CONVERT_VECTS("ushort", cl_ushort, IsInt32, ToUint32()->Value);

../src/kernel.cpp:185:51: error: no matching function for call to ‘v8::Value::ToInt32()’
CONVERT_VECTS("int", cl_int, IsInt32, ToInt32()->Value);

../src/kernel.cpp:186:55: error: no matching function for call to ‘v8::Value::ToUint32()’
CONVERT_VECTS("uint", cl_uint, IsUint32, ToUint32()->Value);

mikeseven commented 6 years ago

Thanks Paul... Just when I was wondering what to do this weekend

--mike


trxcllnt commented 6 years ago

@mikeseven Yeah, luckily I think the changes are mostly isolated. If I were more familiar with C++ macros, I'd make the changes locally and submit a PR. Also there's one I forgot to include that I fixed locally when testing, in types.cpp:

-  static const int idle_time_in_ms = 5;
-  v8::Isolate::GetCurrent()->IdleNotification(idle_time_in_ms);
+  static const double idle_time_in_ms = 5.0f;
+  v8::Isolate::GetCurrent()->IdleNotificationDeadline(idle_time_in_ms);

Seems to compile OK, but I wasn't able to test due to the errors in the OP.

On an unrelated note, I've been exploring coordinating asynchrony between JS and OpenCL events. I can nondeterministically crash node with an error in libuv about failed event loop assertions (nvidia OpenCL drivers on a GTX 1070). I know node has been changing their async resource stuff in recent years, so I can't say I'm surprised race conditions may exist at this level, but is there any way I can help get that more stable?

mikeseven commented 6 years ago

Indeed, some versions of node work well with async but others are problematic. It has always been very hard to track. Typically I like to rely on NAN to abstract these details and keep node-opencl generic across nodejs versions. But sometimes one has to wait until NAN is ready too.

--mike


trxcllnt commented 6 years ago

@mikeseven good news! I didn't realize the feature/upgrade_to_NAN_2.8.0 branch existed till today. I compiled against node 10 and it works great, so we can consider this issue resolved.

In the meantime I've been working to get a minimal repro case to demo the async issues I mentioned before, which I have up in this gist: https://gist.github.com/trxcllnt/a37bca8dd3ddd8ff99a0e39068271ad8. All it does is copy 128MB to the input buffer, execute the demo square kernel, then read back the results about 10,000 times.
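The shape of that loop can be sketched without a GPU. The following is only an illustration of awaiting callback-style completions from JS, not the code in the gist: `fakeEnqueue`, `enqueueAsync`, and the step names are hypothetical stand-ins, and `setImmediate` plays the role of the driver firing an event's completion callback.

```javascript
// Hypothetical stand-in for an enqueue call that completes asynchronously.
// It is NOT part of node-opencl; setImmediate simulates the driver firing
// the event's completion callback on a later turn of the event loop.
function fakeEnqueue(step, onComplete) {
  setImmediate(() => onComplete(null, step));
}

// Wrap the callback-style completion in a Promise so JS can await it.
function enqueueAsync(step) {
  return new Promise((resolve, reject) => {
    fakeEnqueue(step, (err, result) => (err ? reject(err) : resolve(result)));
  });
}

// One iteration: write the input, run the kernel, read the results, in order.
async function runIteration() {
  const done = [];
  for (const step of ["write", "kernel", "read"]) {
    done.push(await enqueueAsync(step));
  }
  return done;
}

// Run the iterations back to back and report how many completed.
async function main(iterations) {
  let completed = 0;
  for (let i = 0; i < iterations; i++) {
    await runIteration();
    completed++;
  }
  return completed;
}

main(3).then((n) => console.log(`completed ${n} iterations`));
```

Every `await` here yields to the event loop once per step, which is one place where an event-driven version can pay overhead that a blocking version does not.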

The good news is that updating to Node 10 and working off the NAN_2.8 branch has resolved the worst of the crashes. I think it may have been caused by my (ab)use of custom events to coordinate both the command queue and call back into JS to resume program execution, as well as running with the --harmony-async-iteration flag. For posterity, here's the libuv error I was seeing (possibly due to an unsafe uv_queue_work call?) and the corresponding backtrace:

node: src/threadpool.c:252: uv__queue_done: Assertion `(((const QUEUE *) (&(req->loop)->active_reqs) == (const QUEUE *) (*(QUEUE **) &((*(&(req->loop)->active_reqs))[0]))) == 0)' failed.

* thread #1: tid = 13305, 0x00007f16afbb20bb libc.so.6`gsignal + 203, name = 'node', stop reason = signal SIGABRT
  * frame #0: 0x00007f16afbb20bb libc.so.6`gsignal + 203
    frame #1: 0x00007f16afbb3f5d libc.so.6`abort + 365
    frame #2: 0x00007f16afba9f17 libc.so.6`___lldb_unnamed_symbol70$$libc.so.6 + 295
    frame #3: 0x00007f16afba9fc2 libc.so.6`__assert_fail + 66
    frame #4: node`uv__queue_done(w=, err=) at threadpool.c:267
    frame #5: node`uv__work_done(handle=) at threadpool.c:251
    frame #6: node`uv__async_io(loop=, w=, events=) at async.c:118
    frame #7: node`uv__io_poll(loop=, timeout=) at linux-core.c:400
    frame #8: node`uv_run(loop=, mode=) at core.c:368
    frame #9: 0x00007f16ad7cb2cd node`node::Start(uv_loop_s*, int, char const* const*, int, char const* const*) + 1213
    frame #10: 0x00007f16ad7c79e3 node`node::Start(int, char**) + 339
    frame #11: 0x00007f16afb9c1c1 libc.so.6`__libc_start_main + 241
    frame #12: 0x00007f16ad791311 node`_start + 41

One time I got lucky and memwatch-next GC'd its way to a segfault, and I got some stack traces with my actual code in it:

(llnode) v8 bt
* thread #1: tid = 6313, 0x00007ffda7ad5abe node`v8::internal::IncrementalMarking::Step(unsigned long, v8::internal::IncrementalMarking::CompletionAction, v8::internal::IncrementalMarking::ForceCompletionAction, v8::internal::StepOrigin) + 350, name = 'node', stop reason = signal SIGSEGV
  * frame #0: 0x00007ffda7ad5abe node`v8::internal::IncrementalMarking::Step(unsigned long, v8::internal::IncrementalMarking::CompletionAction, v8::internal::IncrementalMarking::ForceCompletionAction, v8::internal::StepOrigin) + 350
    frame #1: 0x00007ffda7ad77c3 node`v8::internal::IncrementalMarking::AdvanceIncrementalMarkingOnAllocation() + 819
    frame #2: 0x00007ffda7ad7a1a node`v8::internal::IncrementalMarking::Observer::Step(int, unsigned char*, unsigned long) + 106
    frame #3: 0x00007ffda7b1051f node`v8::internal::NewSpace::InlineAllocationStep(unsigned char*, unsigned char*, unsigned char*, unsigned long) + 127
    frame #4: 0x00007ffda7b1066d node`v8::internal::NewSpace::EnsureAllocation(int, v8::internal::AllocationAlignment) + 109
    frame #5: 0x00007ffda7aaa79b node`v8::internal::Heap::AllocateRaw(int, v8::internal::AllocationSpace, v8::internal::AllocationAlignment) + 203
    frame #6: 0x00007ffda7aae8f8 node`v8::internal::Heap::AllocateFillerObject(int, bool, v8::internal::AllocationSpace) + 24
    frame #7: 0x00007ffda7a6e13d node`v8::internal::Factory::NewFillerObject(int, bool, v8::internal::AllocationSpace) + 45
    frame #8: 0x00007ffda7ce7100 node`v8::internal::Runtime_AllocateInNewSpace(int, v8::internal::Object**, v8::internal::Isolate*) + 96
    frame #9: 0x000007e824e842fd
    frame #10: 0x000007e824e8c4af
    frame #11: 0x000007e825019b4e (anonymous)(this=0x00002f07b88866f1:) at /home/ptaylor/dev/graphistry/viz-app/packages/lib/forceatlas2/bin/async.js:122:10 fn=0x00002dd42d461d39
    frame #12: 0x000007e824f3d196 runTest(this=0x00002f07b88866f1:, 0x00002f021e302311:) at /home/ptaylor/dev/graphistry/viz-app/packages/lib/forceatlas2/bin/async.js:99:34 fn=0x0000128e9805fac1
    frame #13: 0x000007e824f38016 (anonymous)(this=0x00002f07b88866f1:, 0x00001596692cc631:) at (no script) fn=0x00001596692bda81
    frame #14: 0x000007e824f09cfc
    frame #15: 0x000007e824e84239
    frame #16: 0x000007e824e84101
    frame #17: 0x00007ffda7a64875 node`v8::internal::Execution::TryCall(v8::internal::Isolate*, v8::internal::Handle, v8::internal::Handle, int, v8::internal::Handle*, v8::internal::Execution::MessageHandling, v8::internal::MaybeHandle*) + 325
    frame #18: 0x00007ffda7b673ba node`v8::internal::Isolate::PromiseReactionJob(v8::internal::Handle, v8::internal::MaybeHandle*, v8::internal::MaybeHandle*) + 314
    frame #19: 0x00007ffda7b67f51 node`v8::internal::Isolate::RunMicrotasksInternal() + 641
    frame #20: 0x00007ffda7b68541 node`v8::internal::Isolate::RunMicrotasks() + 49
    frame #21: 0x00007ffda75837b0 node`node::InternalCallbackScope::Close() + 144
    frame #22: 0x00007ffda7584438 node`node::InternalMakeCallback(node::Environment*, v8::Local, v8::Local, int, v8::Local*, node::async_context) + 152
    frame #23: 0x00007ffda75845fd node`node::MakeCallback(v8::Isolate*, v8::Local, v8::Local, int, v8::Local*, node::async_context) + 109
    frame #24: 0x00007f4e946919f7 opencl.node`opencl::NoCLEventWorker::HandleOKCallback() + 599
    frame #25: 0x00007f4e94691de6 opencl.node`Nan::AsyncExecuteComplete(uv_work_s*) + 486
    frame #26: node`uv__work_done(handle=) at threadpool.c:251
    frame #27: node`uv__async_io(loop=, w=, events=) at async.c:118
    frame #28: node`uv__io_poll(loop=, timeout=) at linux-core.c:400
    frame #29: node`uv_run(loop=, mode=) at core.c:368
    frame #30: 0x00007ffda75902cd node`node::Start(uv_loop_s*, int, char const* const*, int, char const* const*) + 1213
    frame #31: 0x00007ffda758c9e3 node`node::Start(int, char**) + 339
    frame #32: 0x00007f4e96af41c1 libc.so.6`__libc_start_main + 241
    frame #33: 0x00007ffda7556311 node`_start + 41

Now the only issue I have left is that using the events from enqueueMapBuffer, enqueueNDRangeKernel, enqueueUnmapMemObject is significantly slower than not, but only when I run it on my NVidia card: GeForce GTX 1070 - OpenCL 1.2 CUDA 9.1.84

If you want to test it locally, flip the flag on this line. This flag will capture timings of each enqueue call.
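For anyone who wants similar per-call timings without the gist, here is a minimal sketch of one way to collect them. The `timed` wrapper and the `fakeEnqueue` stand-in are hypothetical names for illustration and are not taken from the gist or from node-opencl; only `process.hrtime.bigint()` is a real Node API.

```javascript
// Wrap a promise-returning function and record each call's duration in ms.
// `timings` is a plain object mapping a label to an array of durations.
function timed(label, fn, timings) {
  return async (...args) => {
    const start = process.hrtime.bigint();
    try {
      return await fn(...args);
    } finally {
      const elapsedMs = Number(process.hrtime.bigint() - start) / 1e6;
      if (!timings[label]) timings[label] = [];
      timings[label].push(elapsedMs);
    }
  };
}

// Hypothetical stand-in for an enqueue call; a 5 ms timer plays the device.
const fakeEnqueue = () => new Promise((resolve) => setTimeout(resolve, 5));

const timings = {};
const timedEnqueue = timed("fakeEnqueue", fakeEnqueue, timings);

timedEnqueue().then(() => {
  console.log(`fakeEnqueue took ${timings.fakeEnqueue[0].toFixed(2)}ms`);
});
```

Recording in `finally` means a rejected call is still timed, which helps when a slow path is also a failing path.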

Here's the performance of the sync version:
node bin/async.js
OpenCL 1.2 CUDA 9.1.84 - GeForce GTX 1070
== Initial loop terminated ==
iteration: 0 (async=false, dTime=5.48ms)
iteration: 525 (async=false, dTime=65.64ms)
iteration: 1050 (async=false, dTime=68.08ms)
iteration: 1575 (async=false, dTime=65.18ms)
iteration: 2100 (async=false, dTime=65.75ms)
iteration: 2625 (async=false, dTime=65.63ms)
iteration: 3150 (async=false, dTime=64.67ms)
iteration: 3675 (async=false, dTime=63.67ms)
iteration: 4200 (async=false, dTime=62.18ms)
iteration: 4725 (async=false, dTime=63.32ms)
iteration: 5250 (async=false, dTime=63.59ms)
iteration: 5775 (async=false, dTime=62.2ms)
iteration: 6300 (async=false, dTime=62.43ms)
iteration: 6825 (async=false, dTime=64ms)
iteration: 7350 (async=false, dTime=62.94ms)
iteration: 7875 (async=false, dTime=61.83ms)
iteration: 8400 (async=false, dTime=61.57ms)
iteration: 8925 (async=false, dTime=61.85ms)
iteration: 9450 (async=false, dTime=63.09ms)
result: success (total=1215.37ms)

And here it is using events:
OpenCL 1.2 CUDA 9.1.84 - GeForce GTX 1070
== Initial loop terminated ==
iteration: 0 (async=true, dTime=8.71ms)
iteration: 525 (async=true, dTime=9983.18ms)
iteration: 1050 (async=true, dTime=12283.73ms)
iteration: 1575 (async=true, dTime=10463.19ms)
iteration: 2100 (async=true, dTime=8840.24ms)
iteration: 2625 (async=true, dTime=10062.69ms)
iteration: 3150 (async=true, dTime=9913.39ms)
iteration: 3675 (async=true, dTime=10127.72ms)
iteration: 4200 (async=true, dTime=10061.24ms)
iteration: 4725 (async=true, dTime=10821.61ms)
iteration: 5250 (async=true, dTime=8161.88ms)
iteration: 5775 (async=true, dTime=10620.67ms)
iteration: 6300 (async=true, dTime=10401.11ms)
iteration: 6825 (async=true, dTime=10621.25ms)
iteration: 7350 (async=true, dTime=10524ms)
iteration: 7875 (async=true, dTime=12942.98ms)
iteration: 8400 (async=true, dTime=11702.97ms)
iteration: 8925 (async=true, dTime=11823.11ms)
iteration: 9450 (async=true, dTime=11603.92ms)
result: success (total=200209.13ms)


Running it on my CPU is a totally different story, which makes me think this might be an issue with the NVidia drivers. But as you can see below, the async version is still about twice as slow as the sync version on the CPU, so maybe I'm missing something about how the command queue schedules events?

Intel sync:
OpenCL 1.2 LINUX - Intel(R) Core(TM) i7-8700K CPU @ 3.70GHz
== Initial loop terminated ==
iteration: 0 (async=false, dTime=4.77ms)
iteration: 525 (async=false, dTime=39.88ms)
iteration: 1050 (async=false, dTime=41.26ms)
iteration: 1575 (async=false, dTime=39.18ms)
iteration: 2100 (async=false, dTime=38.74ms)
iteration: 2625 (async=false, dTime=38.02ms)
iteration: 3150 (async=false, dTime=40.39ms)
iteration: 3675 (async=false, dTime=38.71ms)
iteration: 4200 (async=false, dTime=37ms)
iteration: 4725 (async=false, dTime=36.69ms)
iteration: 5250 (async=false, dTime=36.86ms)
iteration: 5775 (async=false, dTime=36ms)
iteration: 6300 (async=false, dTime=36.82ms)
iteration: 6825 (async=false, dTime=35.89ms)
iteration: 7350 (async=false, dTime=35.92ms)
iteration: 7875 (async=false, dTime=40.04ms)
iteration: 8400 (async=false, dTime=35.97ms)
iteration: 8925 (async=false, dTime=37.06ms)
iteration: 9450 (async=false, dTime=35.48ms)
result: success (total=720.16ms)

Intel async:
OpenCL 1.2 LINUX - Intel(R) Core(TM) i7-8700K CPU @ 3.70GHz
== Initial loop terminated ==
iteration: 0 (async=true, dTime=5.91ms)
iteration: 525 (async=true, dTime=104.97ms)
iteration: 1050 (async=true, dTime=122.22ms)
iteration: 1575 (async=true, dTime=75.72ms)
iteration: 2100 (async=true, dTime=82.1ms)
iteration: 2625 (async=true, dTime=73.5ms)
iteration: 3150 (async=true, dTime=73.2ms)
iteration: 3675 (async=true, dTime=75.29ms)
iteration: 4200 (async=true, dTime=69.19ms)
iteration: 4725 (async=true, dTime=72.36ms)
iteration: 5250 (async=true, dTime=84.45ms)
iteration: 5775 (async=true, dTime=73.96ms)
iteration: 6300 (async=true, dTime=78.74ms)
iteration: 6825 (async=true, dTime=82.75ms)
iteration: 7350 (async=true, dTime=74.83ms)
iteration: 7875 (async=true, dTime=77.44ms)
iteration: 8400 (async=true, dTime=83.01ms)
iteration: 8925 (async=true, dTime=68.65ms)
iteration: 9450 (async=true, dTime=68.13ms)
result: success (total=1513.96ms)
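
To put numbers on the gap, the four totals above can be turned into slowdown ratios. This is plain arithmetic on the `total=` values copied from the logs; the `totals` object and the `slowdown` helper are names made up for this sketch.

```javascript
// Totals copied from the "result: success (total=...)" lines above, in ms.
const totals = {
  nvidia: { sync: 1215.37, async: 200209.13 },
  intel: { sync: 720.16, async: 1513.96 },
};

// Ratio of async to sync wall-clock time for one device.
function slowdown({ sync, async: asyncMs }) {
  return asyncMs / sync;
}

for (const [device, t] of Object.entries(totals)) {
  console.log(`${device}: async is ${slowdown(t).toFixed(1)}x slower than sync`);
}
// nvidia: async is 164.7x slower than sync
// intel: async is 2.1x slower than sync
```

So the event-based path costs roughly 2x on the Intel CPU runtime and roughly 165x on the GTX 1070.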

trxcllnt commented 6 years ago

@mikeseven oh and I forgot to mention, there is this one strange bit where I have to insert a setTimeout delay before decrementing the refCount for an event, otherwise the program exits: https://gist.github.com/trxcllnt/a37bca8dd3ddd8ff99a0e39068271ad8#file-node-opencl-async-test-js-L165

Have you ever run into anything like that?
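One general Node behaviour that may be related, though nothing in this thread confirms it is the cause: a pending Promise does not keep the process alive by itself. If no timer, socket, or other active handle or request is left, Node exits even though a continuation is still waiting. The sketch below demonstrates only that behaviour with two child processes. It does not involve OpenCL at all, and the script strings are made up for the demonstration.

```javascript
const { execFileSync } = require("child_process");

// Child script 1: waits on a promise that nothing will ever resolve.
// No timer, socket, or other handle is active, so Node exits quietly.
const pendingOnly = `
  new Promise(() => {}).then(() => console.log("resumed"));
`;

// Child script 2: a timer keeps the event loop alive until it resolves
// the promise, so the continuation gets to run.
const withTimer = `
  new Promise((resolve) => setTimeout(resolve, 10)).then(() => console.log("resumed"));
`;

// Run a script in a fresh node process and return its trimmed stdout.
function run(script) {
  return execFileSync(process.execPath, ["-e", script], { encoding: "utf8" }).trim();
}

console.log(JSON.stringify(run(pendingOnly))); // ""
console.log(JSON.stringify(run(withTimer)));   // "resumed"
```

If something similar is happening here, the setTimeout would be acting as the handle that keeps the loop alive until the native completion arrives, but that is a guess.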

mikeseven commented 6 years ago

Thanks Paul for this detailed report. As a matter of fact, yes, I have encountered this performance issue many times. It is what I meant by very complex to debug, because the cause is sometimes v8, sometimes nan, uv, or the OpenCL drivers. I typically run the same program in C to rule out driver issues, then start mocking everything in JS, one piece at a time, to pinpoint the problem.

I did encounter the issue sometimes where slowing down event flow works better. This made me think about some garbage collection issue with v8.

Unmapping mem objects is indeed slower in some implementations. It seems to be a regression in some new driver releases, and in that case it is preferable to use the previous release of the driver.

--mike

