Closed mikicho closed 1 month ago
Thanks, @mikicho!
I believe the answers to both of your questions are in setting up an HTTP server and taking a look at what happens to compressed bodies sent from the server. We can request the same endpoint in the browser and in Node.js, and see whether they handle that at all.
But overall, yes, Interceptors ships browser-compatible interceptors, so we can't run things like `zlib` in the `FetchInterceptor`. The mental model behind the request listeners (and MSW handlers by extension) is to act as a server route handler. If you need to pull in an extra library to decompress the request body in real life, you need to do the same with Interceptors. Since we are not a convenience library, we don't need to worry about anything outside of that.
Some related stuff I found:
Gave this a try in Node.js:
```js
import fs from 'fs'
import http from 'http'
import zlib from 'zlib'

const server = http.createServer((req, res) => {
  // Compress the response body.
  if (req.method === 'GET' && req.url === '/resource') {
    res.writeHead(200, {
      'content-type': 'text/plain',
      'content-encoding': 'gzip',
    })
    // Create the read stream per request so repeated requests work.
    fs.createReadStream('./payload.txt').pipe(zlib.createGzip()).pipe(res)
    return
  }

  res.statusCode = 404
  res.end()
})

server.listen(3000, async () => {
  console.log('Server is running on http://localhost:3000')

  const response = await fetch('http://localhost:3000/resource', {
    headers: { 'Accept-Encoding': 'gzip' },
  })
  console.log(await response.text())
})
```
I receive the file's content decoded. I don't do the compression though.
@kettanaito
> I receive the file's content decoded. I don't do the compression though.

What do you mean? You do compress the response in the server.
@kettanaito I added an unmocked test which passes.
Thank you, @mikicho! Will look at this.
We do need to add `Content-Encoding` handling to the mocked responses in the fetch interceptor. Before proceeding, I've opened an issue in Undici to see if they would be willing to abstract their handling into a utility function we can consume directly: https://github.com/nodejs/undici/issues/3412. That would be the best case scenario.
If not, we can integrate their handling into Interceptors manually, although that'd be a repetition I'd rather not introduce.
I realized we still need to duplicate Undici's decompression handling because it's unlikely to be backported to Node v18, or even v20, at this point. I certainly don't have the capacity to see those backports through.
The handling itself is rather straightforward; it's just nice to be consistent. We can afford that nicety once Node.js v22 becomes the minimum supported version across MSW's toolchain.
@mikicho, if you have time, can you give me a hand with this? I've already abstracted the decompression logic in the Undici PR, making it work better with `ReadableStream`:
We basically need this `decompress` function and the surrounding utilities. We can also merge it all into a single function; there's no big value in keeping it separated.
We can also consider using the standard Compression Stream API instead of relying on Undici. I will explore this option.
Opened a pull request that utilizes the Compression Streams API to decompress fetch responses: https://github.com/mswjs/interceptors/pull/661. This works for GZIP and Deflate, but has no Brotli support.
@mikicho, would be curious to hear your thoughts on this one.
Edit: Supporting Brotli seems like a lot of work. We can either release decoding without it (just GZIP and Deflate) and work on it in the background, or block this feature until we find a way to make Brotli happen in Node.js and the browser (it is possible via `brotli-wasm`, it just needs to be distributed properly).
@kettanaito I think we should check the environment and, if running in Node.js, use `zlib`; otherwise, use a browser-compatible library.
WDYT?
@mikicho, that may be problematic. The environment check is fine; the module difference is the problem. We either need to ship two different `FetchInterceptor`s for Node.js and the browser, or somehow hack a way to import a different implementation of the decompression.
That's non-trivial, unfortunately, since we have to support both CJS and ESM at the moment. Dynamic requires will throw when compiling to ESM (imports must be static). We can give this a try.
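For reference, the environment check itself is the easy half; a common heuristic (not Interceptors' actual code) looks like this:

```javascript
// Runtime environment detection: true only when running under Node.js.
// This is an illustrative heuristic, not code from the library.
const isNode =
  typeof process !== 'undefined' &&
  process.versions != null &&
  typeof process.versions.node === 'string'

// In Node.js we could lazily load zlib; in the browser we'd need a
// web-compatible decompressor, which is the hard part discussed above.
console.log(isNode) // true when run in Node.js
```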
The problem remains that the browser-compatible, web-stream-friendly Brotli decompression is non-existent right now. brotli-wasm
is our best bet, if they agree to fix the export conditions and also bundle for Node.js ESM.
@kettanaito I see.
I see it is a bit hacky, but what about a different `tsconfig` with a path alias?
I like this approach because it may elegantly solve other environment-specific code problems that come up in the future.
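For illustration, the path-alias idea could look roughly like this (the file name, alias, and module paths below are hypothetical, not from the actual codebase):

```jsonc
// tsconfig.node.json (hypothetical): resolve the same alias to a
// Node.js-specific module when building the Node bundle.
{
  "extends": "./tsconfig.json",
  "compilerOptions": {
    "baseUrl": ".",
    "paths": {
      "~/decompression": ["./src/utils/decompression.node.ts"]
    }
  }
}
```

A browser-targeted `tsconfig` would map the same alias to a web-compatible implementation.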
I've finished the decompression support, please see the conclusion here: https://github.com/mswjs/interceptors/pull/661#issuecomment-2429045746.
Will merge once the tests are passing.
@kettanaito LGTM
Just curious about `PipelineStream`. I'm not sure why we need this.
Thanks for working on this!
@mikicho, I added `PipelineStream` as an alternative to Node.js's `pipeline`, but for web streams. We use it to compose a dynamic list of `TransformStream`s (decompressors) into a chain of streams to apply to the response body.
Do you see a better approach here?
Ohh, OK! I see. I'm not sure I'd stick with the Web API here (`DecompressionStream`) because AFAIU this code is only used in Node.js, but the current implementation is great too.
It's good to be consistent across the browser and Node.js, and `DecompressionStream` is available in both of those environments. We are also working with a `ReadableStream`, which is the response body. It's easier to convert `zlib` to a web stream than to convert everything else to Node.js streams.
@kettanaito I think there is something off with the decompression order... still looking to understand the root cause (if any).
Right now, we're applying decompression left-to-right based on the `content-encoding` header:
Maybe that's the problem?
Basically:

```
content-encoding: gzip, br
```

The decompression chain would be: `input -> DecompressionStream('gzip') -> BrotliDecompression -> output`.
This behavior looks correct; I'm not sure the code does it.
From what I see:

```js
compressResponse(['gzip', 'br'])('hello world').toString('base64') // H4sIAAAAAAAAA+NmbchIzcnJVyjPL8pJYQYAhavvgA8AAAA=
```

produces a string that is br -> gzip, instead of gzip -> br.
On the contrary:

```js
zlib.brotliCompressSync(zlib.gzipSync('hello world')).toString('base64') // Cw+AH4sIAAAAAAAAA8tIzcnJVyjPL8pJAQCFEUoNCwAAAAM=
```
Your example is incorrect. `compressResponse` behaves similarly to how a server handles the `Content-Encoding` header. It lists all the encodings in the order they were applied, left-to-right. So, for your example, the order would be gzip first, then brotli.
Decompression must be reversed, and there's a bug there currently that I will fix soon.
**`DecompressionStream` in Node.js v18–20**

For some reason, instances of `DecompressionStream` in Node.js v18–20 are printed as `undefined` when logging them out. They are still correct transform streams, just not observable in the console. Extremely confusing when debugging.
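A quick repro sketch (run it on Node.js v18–20 to see the quirk; the stream works regardless of what the console shows):

```javascript
const ds = new DecompressionStream('gzip')

console.log(ds) // may print as undefined-looking output on Node.js v18–20
console.log(ds.readable instanceof ReadableStream) // true on all versions
```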
IIRC, the `Content-Encoding` header lists encodings "in the order they were applied", so the string is first compressed with gzip and only then with br, so the decompression should go backward.
Here's a fun find
I saw that too lol. Seems like `20.18.0` (latest) works as expected.
> Your example is incorrect.

Why not? If I compressed with `gzip, br`, I expect to decompress with br -> gzip. Isn't that right?
I think we mean different things by `->`. I mean order; you seem to mean relation. Composition is easier:

```
Content-Encoding: gzip, br
br(gzip(data))
```
@mikicho, I've pushed the fix that applies the decompression right-to-left to respect the correct order of multiple compressions. Mocked tests pass, but I have two bypass tests failing:
The error seems to be coming from Undici being unable to handle that compression:

```
TypeError: terminated
 ❯ Fetch.emit ../node:events:518:28
Caused by: Error: incorrect header check
 ❯ Zlib.zlibOnError [as onerror] ../node:zlib:189:17

Serialized Error: { errno: -3, code: 'Z_DATA_ERROR' }
```
Our interceptor is not involved here at all. Am I constructing an invalid compressed response with `compressResponse()`? 🤔 I've fixed that function, too, to handle the correct order of compression.
I can confirm that `compressResponse(['gzip', 'deflate'], 'hello world')` returns the same buffer as `zlib.deflateSync(zlib.gzipSync('hello world'))`:

```
decompressResponse: eJyT7+ZgAAPh0x5nT54M1zivf8qTkaFV0IuXGygMAICXB8E=
zlib:               eJyT7+ZgAAPh0x5nT54M1zivf8qTkaFV0IuXGygMAICXB8E=
```
I fixed the `compose` function by removing it. The compose order is correct.
The issue is in Undici, I can reproduce it standalone as well. Will report it to their repo.
Edit: Opened an issue at https://github.com/nodejs/undici/issues/3762.
Since the behavior is reproducible in isolation, our side of the implementation seems to be done. The intention of the bypass tests is to ensure the interceptor doesn't interfere with the regular behavior of the request client, but in this case the request client itself has a bug, so we cannot allow that to influence our tests.
I will skip those two scenarios and link the issue above them. Will do that after we discuss the issue with the Undici folks.
This looks good to me. @mikicho, should we merge this?
🚀
This has been released in v0.36.6!
Make sure to always update to the latest version (`npm i @mswjs/interceptors@latest`) to get the newest features and bug fixes.
Predictable release automation by @ossjs/release.
Added failing test.
@kettanaito

- `interceptors` is cross-platform and AFAIK `zlib` does not exist in the browser. How does the browser decompress the response body? Am I missing something?
- `IncomingMessage` does not decompress the response body, and Node.js leaves this to the HTTP libraries (Axios, got, etc.). So, I think we should follow this behavior and leave the response body compressed.

Summary:

- `zlib` and a custom transform stream to support Brotli decompression in Node.js.