nodejs / undici

An HTTP/1.1 client, written from scratch for Node.js
https://nodejs.github.io/undici
MIT License

Expose "Content-Encoding" handling publicly #3412

Open kettanaito opened 2 months ago

kettanaito commented 2 months ago

This would solve...

Hi! In Interceptors, we are looking at improving the compatibility of response handling for mocked responses. As a part of that effort, we'd like to reproduce the exact behavior Undici exhibits with compressed response bodies.

The implementation should look like...

If the following handling logic were abstracted a bit and exposed as a standalone function, it would help us a lot to stay consistent and would make countless developers' tests more reliable:

https://github.com/nodejs/undici/blob/c1f7d2a9a6556e5a34fdf5ed38c77ce636f1ff02/lib/web/fetch/index.js#L2139-L2161

This is a brief look at the relevant implementation. It may require more effort than just abstracting this code block, and we can discuss the exact API in detail together. I imagine it being something like decode(response: Response): Decoder[]. Perhaps exposing the whole pipeline would make consumption even easier.

I have also considered...

We are considering writing the Content-Encoding handling by hand, but I'd like to join efforts on this one. I don't wish to repeat things. I like using the platform. I like Undici.

Additional context

I'm interested in opening a pull request and seeing this through, given it's something you agree on. Let me know.

mcollina commented 2 months ago

@KhafraDev wdyt?

KhafraDev commented 2 months ago

Sounds fine to me.

kettanaito commented 2 months ago

Thank you for your timely responses, @mcollina, @KhafraDev!

I think a utility function like this would make the most sense:

import { pipeline } from 'node:stream'

export function decompress(response: Response) {
  const decoders = []
  // ...populate the `decoders` array based on the passed response.

  // Expose the entire pipeline since `pipeline` itself is likely
  // too low-level to export from Undici. These would also be called
  // together anyway.
  return pipeline(response.body, ...decoders, () => {})
}
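
To make that concrete, the decoders could be populated roughly the way the linked fetch code does it with node:zlib. This is a simplified sketch of my own (the createDecoders name is made up), not the actual implementation (the real code also consults the request method and the response status before decoding):

import zlib from 'node:zlib'
import { Transform } from 'node:stream'

// Hypothetical helper; names are illustrative only.
function createDecoders(response: Response): Transform[] {
  // "Content-Encoding: gzip, br" lists the encodings in the order they
  // were applied, so they have to be decoded in reverse order.
  const codings = (response.headers.get('content-encoding') ?? '')
    .toLowerCase()
    .split(',')
    .map((coding) => coding.trim())
    .filter(Boolean)
    .reverse()

  const decoders: Transform[] = []
  for (const coding of codings) {
    if (coding === 'gzip' || coding === 'x-gzip') {
      decoders.push(zlib.createGunzip())
    } else if (coding === 'deflate') {
      decoders.push(zlib.createInflate())
    } else if (coding === 'br') {
      decoders.push(zlib.createBrotliDecompress())
    } else {
      // Unknown encoding: skip decoding and pass the body through as-is.
      decoders.length = 0
      break
    }
  }

  return decoders
}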

I'd love to hear your thoughts on whether the response instance alone is enough to decide that. I can see request references in that code snippet above as well.

KhafraDev commented 2 months ago

#3274 might solve this. I'd need to see some use cases for a standalone decompress.

kettanaito commented 2 months ago

On the consumption side of things, it would be nice if this behavior could be imported from Node.js so that it stays consistent with whichever Undici version ships with a particular Node.js version. In other words, it would always align with the Node.js version used by the consumer of Interceptors.

Granted, the decompression logic isn't supposed to change often. I'm mostly thinking bug fixes here.

kettanaito commented 2 months ago

@KhafraDev, would that allow us to use the decompression without creating any sort of Undici interceptor? That'd be ideal. The functionality itself doesn't need an interceptor; it can be a utility function that handles the body transformations based on the Content-Encoding header. Compressed body in, properly decompressed body out. Then the same utility could be reused in the interceptor pull request you've mentioned, I'd hope.
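
To illustrate the usage I have in mind (the decompress import below is hypothetical; its name and export location are exactly what we are discussing here):

import zlib from 'node:zlib'
import { text } from 'node:stream/consumers'
// Hypothetical import: not a real Undici export yet.
import { decompress } from 'undici'

async function example() {
  // A mocked response with a gzip-compressed body, the way Interceptors
  // would hand it over.
  const compressed = zlib.gzipSync('hello world')
  const response = new Response(compressed, {
    headers: { 'content-encoding': 'gzip' }
  })

  // Compressed body in, properly decompressed body out.
  const decompressed = decompress(response)

  // `text()` from node:stream/consumers accepts Node streams, web streams,
  // and async iterables, so it works regardless of what exactly is returned.
  return await text(decompressed) // 'hello world'
}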

PandaWorker commented 2 months ago

I think we should split compression and decompression into separate interceptors and add decompress: boolean and compress: boolean parameters to the request options, enabled by default.

import { request } from 'undici';

async function test(){
  const resp = await request('https://example.site/', {
    method: 'POST',
    body: `qwerty123`,

    decompress: true,
    compress: true
  });

  // decompressed response data
  return await resp.body.text();
}

OR

import { request, Agent, interceptors } from 'undici';

const dispatcher = new Agent().compose(interceptors.decompress()).compose(interceptors.compress(['gzip', 'br']));

async function test(){
  const resp = await request('https://example.site/', {
    method: 'POST',
    body: `qwerty123`,

    decompress: true,
    compress: true,
    dispatcher: dispatcher
  });

  // decompressed response data
  return await resp.body.text();
}

kettanaito commented 2 months ago

@PandaWorker, that works as long as the decompression is exposed as a standalone utility, which is what this proposal is about. The way we mean to use it in Interceptors (which are not Undici interceptors, pardon the naming confusion) is with arbitrary response bodies. What I'm proposing here is likely more low-level than what you have in mind, so the two may combine in the end.

metcoder95 commented 2 months ago

If the goal is to expose the decompression as a utility, #3274 won't be what closes this issue, as it attempts to expose it through an interceptor to be used with dispatcher.compose.

They are slightly interconnected but not necessarily the same (especially because this issue aims to have a utility that works with the WHATWG Response object, I assume). So a PR aiming just for that would be best.

kettanaito commented 2 months ago

@metcoder95, agreed. #3274 can reuse the utility function I am bringing up as a part of this suggestion. I'm glad to hear we are aligned on that!

If there are no objections, I can give this a go when I have a minute (likely next week). A code review and a bit of guidance would be appreciated.

kettanaito commented 2 months ago

Got excited by this and started on it sooner. The pull request is at https://github.com/nodejs/undici/pull/3423; it's extremely early.

mikicho commented 4 days ago

Hey there :) Can I help merge this in?

kettanaito commented 3 days ago

Hey, @mikicho 👋 Yes! I'm short on time these weeks, but I'm keeping a list of remaining todos on the pull request: https://github.com/nodejs/undici/pull/3423.

I could use some help with type-testing the newly added utilities, and with looking into an odd issue where seemingly unrelated tests now fail just because I moved the utilities to a different directory.