Closed jan-tosovsky-cz closed 1 month ago
Define "crashes"? I'd like to see the stack trace or whatever you get.
Node.js exits during executing this line:
await finished(Readable.fromWeb(response.body).pipe(fileStream))
No exception is thrown, hence it is not caught.
When trying to capture the error this way, that err.out
stays empty:
node script.js 1> log.out 2> err.out
However, when looking into Windows Event Viewer, no crash was reported for Node.js, so it may not be a typical app crash.
So I think that site has something unexpected (SSL certificate, server settings, ...) which triggers an exception not handled correctly and node.js exits abruptly.
Is there any option to enable some node.js logging to inspect what is going on?
Ah, okay. I think your report is a duplicate of #48466 and one or two other similar reports (that I can't find right now but I remember commenting on them. They're in the bug tracker... somewhere.)
If you agree it's the same bug, then please go ahead and close this issue.
The linked issue, if I understand correctly, causes an exiting the Node.js when the stream is consumed, but it is interrupted before calling close() method. Ok, this should be fixed. But there is another issue here. Why that stream interruption is not detected and the corresponding exception thrown, so it could be processed in my catch block?
When accessing different URLs I can see many errors (certificate problems, timeouts) so I would expect to handle such interruption events in the same way.
From the linked issue I deduce here the server closes the connection before sending the response. If so, it should be handled gracefully. But I can be wrong. I am not an experienced Node.js user to be able to dig into it more deeply.
Why that stream interruption is not detected and the corresponding exception thrown, so it could be processed in my catch block?
Most likely because the promise returned by finished
is never rejected (nor resolved), instead the process exits because nothing is keeping it alive. If you were using ESM (a .mjs
file), you could use top-level await, and you would see an exit code of 13.
I've rename the issue because there's no crash.
@nodejs/whatwg-stream can you comment wether the current behavior is expected or it should be fixed?
I'm not sure what is the actual issue. The following code works correctly on my machine (it prints finished
):
const fs = require("fs")
const path = require("path")
const { Readable } = require("stream")
const { finished } = require("stream/promises")
async function saveURLToFile(url, path) {
try {
const response = await fetch(url)
const fileStream = fs.createWriteStream(path)
await finished(Readable.fromWeb(response.body).pipe(fileStream))
console.log('finished')
} catch (err) {
console.log(err)
}
}
saveURLToFile("http://www.milicovice.cz/", path.resolve(__dirname, "output.html"))
@nodejs/undici I think I saw this problem already but I can't find an issue/pr.
@mcollina That code still doesn't work on my laptop in Nodejs v18.17.1 on Win64 Pro (10.0.22621 Build 22621). But now this stacktrace is printed:
C:\Program Files\nodejs\node.exe .\save-url-to-file.js
Uncaught TypeError TypeError: terminated
at onAborted (internal/deps/undici/undici:11442:53)
at emit (events:526:35)
at terminate (internal/deps/undici/undici:10695:14)
at fetchParams.controller.resume (internal/deps/undici/undici:11421:36)
at processTicksAndRejections (internal/process/task_queues:95:5)
--- TickObject ---
at init (internal/inspector_async_hook:25:19)
at emitInitNative (internal/async_hooks:202:43)
at emitInitScript (internal/async_hooks:504:3)
at nextTick (internal/process/task_queues:132:5)
at onDestroy (internal/streams/destroy:103:15)
at done (internal/webstreams/adapters:526:11)
at destroy (internal/webstreams/adapters:544:7)
at _destroy (internal/streams/destroy:109:10)
at destroy (internal/streams/destroy:71:5)
at destroyer (internal/streams/destroy:319:12)
at <anonymous> (internal/webstreams/adapters:555:7)
at processTicksAndRejections (internal/process/task_queues:95:5)
--- Promise.then ---
at newStreamReadableFromReadableStream (internal/webstreams/adapters:548:3)
at Readable.fromWeb (internal/streams/readable:1403:27)
at saveURLToFile (c:\test\save-url-to-file.js:10:33)
at processTicksAndRejections (internal/process/task_queues:95:5)
--- await ---
at <anonymous> (c:\test\save-url-to-file.js:17:1)
at Module._compile (internal/modules/cjs/loader:1256:14)
at Module._extensions..js (internal/modules/cjs/loader:1310:10)
at Module.load (internal/modules/cjs/loader:1119:32)
at Module._load (internal/modules/cjs/loader:960:12)
at executeUserEntryPoint (internal/modules/run_main:81:12)
at <anonymous> (internal/main/run_main_module:23:47)
Process exited with code 1
Exactly the same stracktrace I am getting also when saving the content for http://www.hornimarsov.cz/ so this issue is not isolated for the above single server URL.
Something that would be really useful to debug this is to have a complete reproduction, i.e. a local Node.js server that can trigger this problem in the client.
The bug might even be Windows specific.
https://github.com/orgs/nodejs/teams/undici I think I saw this problem already but I can't find an issue/pr.
Couldn't find it, the closest one was https://github.com/nodejs/undici/issues/2026
FWIW when I run the snippet in https://github.com/nodejs/node/issues/49233#issuecomment-1704984163 using the latest main
, I am getting exit code 13 and 'finished'
is never logged. Or sometimes I'm getting an error, it seems to be flaky:
node:events:492
throw er; // Unhandled 'error' event
^
TypeError: terminated
at Fetch.onAborted (node:internal/deps/undici/undici:11463:53)
at Fetch.emit (node:events:526:35)
at Fetch.terminate (node:internal/deps/undici/undici:10717:14)
at fetchParams.controller.resume (node:internal/deps/undici/undici:11442:36)
at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
Emitted 'error' event on Readable instance at:
at emitErrorNT (node:internal/streams/destroy:151:8)
at emitErrorCloseNT (node:internal/streams/destroy:116:3)
at process.processTicksAndRejections (node:internal/process/task_queues:82:21) {
[cause]: Error: incorrect header check
at Zlib.zlibOnError [as onerror] (node:zlib:189:17) {
errno: -3,
code: 'Z_DATA_ERROR'
}
}
Node.js v21.0.0-pre
Funny insight: right now I can reproduce your results @aduh95.
My understanding is that our decompression algorithm is failing to decompress that payload: this is likely some kind of incompatibility with the meaning of deflate
compression.
Then because promises are resolved in the next tick and there is nothing to keep that event loop busy, the value or error are never emitted.
Wappalyzer plugin + Dev Tools reports this set of server details
http://www.milicovice.cz/ Kentico CMS (written in C#) powered by ASP.NET 4.0.30319 running on IIS 10.0 on Windows Server. The response is compressed using deflate. Keep-alive is not specified.
http://www.hornimarsov.cz/ PHP 5.5.38 running on Apache, encoded using gzip. Keep-alive timeout=5, max=100.
Each server's root cause will most likely differ as server configurations differ significantly.
Something about compression on IIS is here https://learn.microsoft.com/en-us/iis/extensions/iis-compression/iis-compression-overview There is stated two libs are available for the deflate compression, gzip.dll and iiszlib.dll. The second is not OOTB, but is more efficient and has better dynamic compression support. But it could explain incompatibilities.
I think this is due to https://github.com/nodejs/undici/issues/2128.
I've investigated this issue with Node v18.20.2, v20.16.0 and v22.9.0. Both v20 and v22 worked as expected. @jan-tosovsky-cz Do you think this can be closed now?
Per the comment above. See https://github.com/nodejs/undici/pull/2650
Thanks a lot for managing the fix. I downloaded 20.17.0 LTS (Win MSI installer) and ran the script again. The task is now completed without any issues.
Making this functionality more robust (i.e. supporting even obscure old sites) will eliminate headaches for many future developers. Kudos to all!
Version
v18.17.1
Platform
Win64
Subsystem
No response
What steps will reproduce the bug?
Run this script:
How often does it reproduce? Is there a required condition?
Always
What is the expected behavior? Why is that the expected behavior?
The content (HTML page) is stored to the file.
What do you see instead?
The node crashes.
Additional information
This method works for other sites. It is unclear why this one is problematic. That error is apparently uncaught by the catch block. This site belongs to the municipality of a small village. Please minimize the traffic (during testing) to a minimum. Thanks.