nodejs / node

Node.js JavaScript runtime ✨🐢🚀✨
https://nodejs.org

Expose system errors #53354

Open nbabanov opened 1 month ago

nbabanov commented 1 month ago

What is the problem this feature will solve?

We currently see intermittent UNKNOWN write errors in our webpack-based build system.

Example error for v18.17.1:

Error: write UNKNOWN
    at ChildProcess.target._send (node:internal/child_process:865:20)
    at ChildProcess.target.send (node:internal/child_process:738:19)

...

Emitted 'error' event on ChildProcess instance at:
    at node:internal/child_process:869:39
    at processTicksAndRejections (node:internal/process/task_queues:77:11) {
  errno: -4094,
  code: 'UNKNOWN',
  syscall: 'write'
}

After digging into the Node.js and libuv code, it turned out to be a loss-of-information problem. Because the original error information is lost on Windows, we cannot debug this properly and are wasting pipeline time on a flaky issue.

What is the feature you are proposing to solve the problem?

When reporting system errors from libuv, please also include the actual system error in a field on the error object.

There is already a libuv API for this: uv_fs_get_system_error.

All of this is already mentioned here: https://github.com/libuv/libuv/issues/2348

What alternatives have you considered?

Recompiling libuv and Node.js just to log the error. I would probably do it, but it would still be nice to have this built in.

joyeecheung commented 1 month ago

That looks like a case of libuv itself not knowing how to describe the error. uv_fs_get_system_error wouldn't help either, because it returns an integer (what you are seeing as errno: -4094). Node.js already uses UV_ERRNO_MAP to map all known system error names and attach them to the errors it emits. If you are seeing UNKNOWN, it's because libuv doesn't know the human-readable description of that error code either (perhaps because it's not an error that's meaningful cross-platform). I think one way to provide more information would be to use FormatMessage() when libuv doesn't have a pre-baked error message string for the code, though I don't have a Windows device to verify. cc @nodejs/platform-windows
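
For reference, the errno-to-name table that Node.js exposes can be inspected from JavaScript. This is a minimal sketch, assuming a Node.js version that provides util.getSystemErrorMap() (v16 or later); lookupErrno is a hypothetical helper name, not part of Node.js.

```javascript
const util = require('node:util');

// Hypothetical helper: look up a libuv errno in the table Node.js exposes.
// Returns null when the errno has no entry in the table.
function lookupErrno(errno) {
  const entry = util.getSystemErrorMap().get(errno);
  if (entry === undefined) return null;
  const [name, message] = entry;
  return { errno, name, message };
}

// -4094 is the errno from the report above.
console.log(lookupErrno(-4094));
```

The lookup only ever sees the libuv errno, so it cannot recover the Windows error code that was collapsed into UNKNOWN before the error reached JavaScript.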

huseyinacacak-janea commented 1 month ago

This error is caused by libuv, which doesn't return the actual error code to Node.js.

I'm able to reproduce the issue using the example provided in the referenced issue. I plan to fix it by adding the error code to uv_translate_sys_error so that it gives a more meaningful error message. To make sure I cover your error, could you provide a minimal example that reproduces your issue?
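
To illustrate why an unmapped code surfaces as UNKNOWN, here is a simplified model of a translation table with a fallback. It is a sketch, not libuv's implementation: the table contents, translateSysError, and the sysErrno field are all hypothetical, and only mirror the idea of keeping the original system code next to the translated name.

```javascript
// Illustrative subset only; libuv's real table is written in C and is much larger.
const WINDOWS_TO_UV = new Map([
  [2, 'ENOENT'], // ERROR_FILE_NOT_FOUND
]);

// Hypothetical helper: translate a Windows error code to a libuv-style name.
// The original code is kept in `sysErrno` so it is not lost when no mapping exists.
function translateSysError(sysErrno) {
  const code = WINDOWS_TO_UV.get(sysErrno) ?? 'UNKNOWN';
  return { code, sysErrno };
}

console.log(translateSysError(2));
console.log(translateSysError(1450)); // a code with no entry in this table
```

In this model, adding an entry to the table is what turns UNKNOWN into a meaningful code, while the extra field keeps the raw value available for debugging in either case.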

nbabanov commented 1 month ago

@huseyinacacak-janea

Yep, you are correct. I have built a custom build of Node.js, and the actual error on Windows was:

// MessageId: ERROR_NO_SYSTEM_RESOURCES
//
// MessageText:
//
// Insufficient system resources exist to complete the requested service.
//
#define ERROR_NO_SYSTEM_RESOURCES        1450L

Maybe you could reproduce it by running a VM and starving it of RAM?

huseyinacacak-janea commented 1 month ago

@nbabanov Thank you for sharing the error. I am attempting to pinpoint the specific libuv function and Windows API that are causing failures. To reproduce the issue, I’ve reduced the RAM on my virtual machine and tested with several examples, but I haven’t been able to reproduce the error you’ve described.

Would you be able to share a Node.js sample that demonstrates the problem?

nbabanov commented 1 month ago

@huseyinacacak-janea I will try to create one in two days' time.

huseyinacacak-janea commented 3 weeks ago

Hey @nbabanov, is there any update on this issue?

huseyinacacak-janea commented 5 days ago

I opened a PR for the error type ERROR_BAD_EXE_FORMAT and it landed. PR: https://github.com/libuv/libuv/pull/4445

If you can give me more information on this issue, I can open a new PR for the error you are experiencing.