Closed Rafael-006 closed 2 years ago
Just to check if my understanding of the flow is correct:
- 139949697433968 & 139949965870272 are acquired and the initial call with 139949697433968 completed regularly
- Network disconnected
- Call with 139949965870272 ends with a cancel timeout
- Call with 139949697433968 ends with a segfault?

Could you please confirm which of the two connection handles triggers the segfault?

And could you please record and attach here the detailed log, from the initial call to the segfault, by setting trace level '3' in the DEFAULT section of sapnwrfc.ini. Example:

sapnwrfc.ini
DEFAULT
RFC_TRACE=3 # 'global' trace level

DEST=QM7
USER=demo
PASSWD=welcome
ASHOST=ldciqm7
SYSNR=00
CLIENT=005
LANG=EN
# RFC_TRACE=2 # 'local' trace level
Here is the log at trace level 3
Could you please attach the complete log files, or email them to me? Can you figure out if the segfault occurs for the connection handle from the initial call, or for another one?
Here is a scenario: when we try to do cancelClient, we get an error. Kindly find the log attached:
FATAL ERROR: HandleScope::HandleScope Entering the V8 API without proper locking in place
1: 0xa222f0 node::Abort() [node]
2: 0x96411f node::FatalError(char const*, char const*) [node]
3: 0xb97c3a v8::Utils::ReportApiFailure(char const*, char const*) [node]
4: 0xb9970c v8::EscapableHandleScope::EscapableHandleScope(v8::Isolate*) [node]
5: 0x9dfeb6 napi_open_escapable_handle_scope [node]
6: 0x7f134b3b9934 node_rfc::rfcSdkError(_RFC_ERROR_INFO*, node_rfc::_RfmErrorPath*) [/home/project/node_modules/node-rfc/lib/binding/sapnwrfc.node]
7: 0x7f134b3d4cf4 node_rfc::Cancel(Napi::CallbackInfo const&) [/home/project/node_modules/node-rfc/lib/binding/sapnwrfc.node]
8: 0x7f134b3d6bf2 Napi::details::CallbackData<Napi::Value (*)(Napi::CallbackInfo const&), Napi::Value>::Wrapper(napi_env__*, napi_callback_info__*) [/home/project/node_modules/node-rfc/lib/binding/sapnwrfc.node]
9: 0x9d7e5f [node]
10: 0xc03d0b [node]
11: 0xc052b6 [node]
{"level":"info","time":"Wed Sep 08 2021 12:53:15 GMT+0000","pid":61122,"hostname":"admin","name":"Server","reqId":"req-2","res":{"url":"/v1/login","method":"POST","statusCode":200},"responseTime":14798.31078099832,"msg":"request completed"}
12: 0xc05936 v8::internal::Builtin_HandleApiCall(int, unsigned long*, v8::internal::Isolate*) [node]
13: 0x1423359 [node]
Aborted (core dumped)
Sorry, I meant the RFC trace log, captured in *.trc files
Could you please send the source script to reproduce the issue?
Fun fact: I have the same issue, but wrapping the code inside addon.cc:Cancel in a mutex prevents this from happening (on the other hand, it generates a segfault for a different reason).
@bsrdjan quick question re this topic - is it expected behavior that RegisterModule is called more than once? With the help of gdb I have determined that when cancelling a client it is in fact called twice (if there are two cancellations).
Upon further inspection, I'm either missing something or some very illegal behavior is happening. As far as I understand, a worker thread is created in terminate to spawn cancel.ts, which then requires the addon bindings again. This is why there is a second call to RegisterModule.
But why do we need to spawn a worker thread to cancel something? Can't it be done on the same thread? Just calling cancel directly?
Indeed, cancelling the client in the main thread with a simple call to noderfc_binding.cancel
completely resolves this issue for me.
EDIT: Just to elaborate on why I called the behavior illegal: in node-rfc the environment is cached (node_rfc::__env), and the scope inside rfcSdkError, which is invoked during a call to Cancel, uses this cached environment. However, when Cancel is called from a worker thread, the active environment is not the main caller's environment, hence a segfault may occur: you are trying to create an EscapableHandleScope with the cached environment node_rfc::__env while inside a worker thread's environment.
(As a side note, it is generally not recommended to cache the environment, though I can understand doing it in node-rfc, as libsapnwrfc generally does not operate well with multiple programs / independent threads using it.)
But why do we need to spawn a worker thread to cancel something? Can't it be done on the same thread? Just calling cancel directly?
From SAP NWRFC SDK doxygen documentation of RfcCancel
method:
Cancels the RFC call which is currently being called over the given RFC connection and closes the connection. Can be used only on an RFC client connection and needs to be called from a different thread than the one currently executing the RFC call.
It is definitely a bug, need to investigate a clean solution here.
Ok, but then you can spawn a simple thread for that purpose alone, without incurring the entire separate environment of a worker thread, no? Just using regular pthreads (or in this case uv, since it fits nicely with node stuff anyway). So during cancel you simply spawn a new C++ thread (no workers), then set up a condition variable (or whatever your favorite mutex strategy is) and wait for the result.
Does that sound cleaner? I personally would certainly choose to go that route.
@samuel-c-allan, yes, it sounds good and shall be tested. Would you like to create a PR?
@bsrdjan I would love to :) Let me code up something, submit a PR and then you can check whether it is presentable
@bsrdjan Quick question - are you fine if we go with std::thread instead? While I'm able to get an example working with uv, it involves C-style function pointers, and while I am very comfortable working with them, it kind of goes against the whole higher-level C++ style of the project. If you don't particularly care about it, I'll go with the approach I'm using right now, but if you prefer more C++-style code I'll switch to std::thread.
Describe the bug
I am using a connection pool to acquire, call (with timeout) and release connections. On timeout, I receive the event sapnwrfc:clientCancel. I then get a Segmentation fault (core dumped) error and my node application crashes, requiring a restart to work properly.
To Reproduce
Version of node-rfc - 2.5.1, installed locally
To reproduce the issue:
poolOptions: { low: 2, high: 10 }
Connection handle details are as follows:
- Initial call connection handle - 139949697433968
- Other two API call connection handles - 139949697433968 & 139949965870272