siemens / meta-iot2050

SIMATIC IOT2050 Isar/Debian Board Support Package
MIT License

node-red: ./src/threadpool.c:329: uv__queue_done: Assertion `uv__has_active_reqs(req->loop)' failed #386

Closed: bergmanu closed this issue 1 year ago

bergmanu commented 1 year ago

A customer in the forum reported that his node-red application crashes periodically when it reads four inputs at 7.5 Hz. The reported crash reason is: Assertion `uv__has_active_reqs(req->loop)' failed

I reproduced the issue with a simplified flow (see attachment): only three digital inputs are connected, and the DQs of an S7-1500 PLC drive a "blink" signal at 7.5 Hz onto all of them simultaneously. In most cases, node-red crashes as soon as the blink application starts, with the following message:

Jul 13 18:19:50 iot2050-debian node-red[818]: node-red: ./src/threadpool.c:329: uv__queue_done: Assertion `uv__has_active_reqs(req->loop)' failed
Jul 13 18:19:50 iot2050-debian systemd[1]: node-red.service: Main process exited, code=killed, status=6/ABRT
Jul 13 18:19:50 iot2050-debian systemd[1]: node-red.service: Failed with result 'signal'.
Jul 13 18:19:50 iot2050-debian systemd[1]: node-red.service: Consumed 29.504s CPU time.
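To count how often this happens over a longer test run, the journal output can be scanned for the assertion line and the ABRT report that follows it. The following is a minimal sketch that assumes the log format shown in the excerpt above; the function name `summarize_crashes` and the `SAMPLE` list are illustrative only and not part of any tool mentioned in this thread.

```python
import re

# Matches the libuv assertion line as it appears in the journal excerpt above.
ASSERT_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+[\d:]{8})\s+(?P<host>\S+)\s+node-red\[(?P<pid>\d+)\]:\s+"
    r".*Assertion [`'](?P<expr>.+?)' failed"
)
# Matches systemd's report that the main process was killed by a signal.
ABORT_RE = re.compile(
    r"node-red\.service: Main process exited, code=killed, status=(?P<status>\S+)"
)


def summarize_crashes(lines):
    """Return one dict per assertion failure found in the given journal lines."""
    crashes = []
    for line in lines:
        m = ASSERT_RE.search(line)
        if m:
            crashes.append({
                "time": m["ts"],
                "pid": int(m["pid"]),
                "assertion": m["expr"],
                "status": None,  # filled in by the systemd line, if present
            })
            continue
        m = ABORT_RE.search(line)
        if m and crashes and crashes[-1]["status"] is None:
            crashes[-1]["status"] = m["status"]
    return crashes


# The four journal lines quoted in this report, used as sample input.
SAMPLE = [
    "Jul 13 18:19:50 iot2050-debian node-red[818]: node-red: ./src/threadpool.c:329: "
    "uv__queue_done: Assertion `uv__has_active_reqs(req->loop)' failed",
    "Jul 13 18:19:50 iot2050-debian systemd[1]: node-red.service: "
    "Main process exited, code=killed, status=6/ABRT",
    "Jul 13 18:19:50 iot2050-debian systemd[1]: node-red.service: "
    "Failed with result 'signal'.",
    "Jul 13 18:19:50 iot2050-debian systemd[1]: node-red.service: "
    "Consumed 29.504s CPU time.",
]

if __name__ == "__main__":
    for crash in summarize_crashes(SAMPLE):
        print(crash)
```

On the device, the same function could be fed with the lines of `journalctl -u node-red.service --no-pager` instead of the sample list.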

Changing the frequency to 1 Hz does not change the behavior.

UPDATE: The crash only happens when at least two DIs receive the signal simultaneously. With a signal on only one DI, everything works as expected. As soon as the signal is applied to a second input, the issue occurs.

Hardware / Software used:

jan-kiszka commented 1 year ago

Reproduced, also with the board generating the signals itself (I don't have a PLC around). There seems to be a race condition, hopefully in mraa (which is easier to fix) and not in libuv, the Node.js event helper library. Needs more debugging.
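For readers unfamiliar with this class of bug: the libuv documentation states that the event loop API is not thread-safe unless noted otherwise, with `uv_async_send` as the documented way to wake the loop from another thread. The sketch below is only an analogy in Python's asyncio, not a diagnosis of this issue and not mraa or libuv code. It assumes several threads (standing in for per-input interrupt threads) that report edges to a single event loop, and it hands each event over with `loop.call_soon_threadsafe` instead of touching loop-owned objects directly. The names `edge_source` and `collect` are hypothetical.

```python
import asyncio
import threading


def edge_source(loop, queue, pin, count):
    """Stand-in for an interrupt thread: reports `count` edges on `pin`."""
    for i in range(count):
        # Loop-owned objects must not be touched from this thread directly;
        # call_soon_threadsafe schedules the put on the loop's own thread.
        loop.call_soon_threadsafe(queue.put_nowait, (pin, i))


async def collect(pins, count):
    """Start one source thread per pin and gather all reported edges."""
    loop = asyncio.get_running_loop()
    queue = asyncio.Queue()
    threads = [
        threading.Thread(target=edge_source, args=(loop, queue, pin, count))
        for pin in pins
    ]
    for t in threads:
        t.start()
    # Every event is consumed on the loop thread, in arrival order.
    events = [await queue.get() for _ in range(len(pins) * count)]
    for t in threads:
        t.join()
    return events


if __name__ == "__main__":
    events = asyncio.run(collect(pins=[0, 1, 2], count=100))
    print(len(events), "edges received")
```

With only one source thread there is nothing to race against, which would be consistent with the observation that a single active DI works as expected, although the thread does not confirm that this is the mechanism here.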

BTW, it seems the issue was also reported to Node-RED before, see https://discourse.nodered.org/t/assertion-uv-has-active-reqs-req-loop-failed/68564.

jan-kiszka commented 1 year ago

See https://github.com/libuv/libuv/issues/3846 for a first attempt to fix it. Looks good here with a 50 Hz self-trigger on two inputs, which otherwise crashes immediately.

jan-kiszka commented 1 year ago

A (likely) more correct fix is under testing now, see https://github.com/siemens/mraa/commit/b10341fae399f8c25887c348af9577640b1e233e.

bergmanu commented 1 year ago

I can confirm that the image containing this fix no longer shows this behavior. I tested the flow with 3 connected DIs (7.5 Hz) for 2 h without any error.