sveltejs / kit

web development, streamlined
https://svelte.dev/docs/kit
MIT License
18.73k stars 1.94k forks source link

Using fetch to proxy resources fails with net::ERR_CONTENT_DECODING_FAILED error #12197

Open clement-fischer opened 6 months ago

clement-fischer commented 6 months ago

Describe the bug

With 3rd party cookies soon to be blocked for authentication with Firebase, I was trying to create a hook to proxy the requests to Firebase as described here. It's simple to implement, but it leads to errors on the client side (Chrome in my case). Looking at the Network tab in Chrome Dev Tools shows an error in the "Status" column: (failed)net::ERR_CONTENT_DECODING_FAILED.

I added some debugging info to understand the problem and found a mismatch between the size of the request as reported by Chrome Dev Tools vs the Content-Length header in the response to the proxied request.

I think that what is happening is Sveltekit returns the HTTP response body as-is (compressed with gzip), but does not return the associated Content-Encoding: gzip header. Sveltekit does not pass any of the headers from the proxied response Perhaps this is intentional, but I think this is what's causing this issue.

The workaround is to decode the response and create a new one, but this is a little annoying to do because some of the headers may need to be added back. For example in the case of Firebase, the workaround would not work until the 'Content-Type' header was added, the Firebase library could would reject the response without it (no reported error, making it difficult to understand). And it is slower too.

It might be worth to add that I wish there was a better way to proxy these requests instead of doing it in code. I know there is a way with the Vercel adapter using rewrites, and in local development vite can do it, but there is no such feature with node-adapter. If node-adapter had such a feature, I think it'd be more convenient than proxying with a hook. That could be a feature request ticket, but I think the issue described above is a bug.

Reproduction

In any Sveltekit app, add a /wiki route with the following code:

export async function GET() {
    // // WORKAROUND
    // const resp = await fetch('https://en.wikipedia.org/wiki/Main_Page');
    // const t = await resp.text();
    // return new Response(t);

    return await fetch('https://en.wikipedia.org/wiki/Main_Page');
}

When using the code as is, the error will show in the status column of the network tab in Chrome Dev Tools. Using the commented out code works and the raw HTML will show on the page.

Logs

No response

System Info

$ npx envinfo --system --binaries --browsers --npmPackages "{svelte,@sveltejs/*,vite}"
Need to install the following packages:
envinfo@7.13.0
Ok to proceed? (y) y

  System:
    OS: macOS 14.4.1
    CPU: (10) arm64 Apple M1 Max
    Memory: 181.34 MB / 32.00 GB
    Shell: 5.9 - /bin/zsh
  Binaries:
    Node: 20.11.0 - ~/.nvm/versions/node/v20.11.0/bin/node
    npm: 10.2.4 - ~/.nvm/versions/node/v20.11.0/bin/npm
  Browsers:
    Chrome: 124.0.6367.119
    Edge: 124.0.2478.80
    Safari: 17.4.1
  npmPackages:
    @sveltejs/adapter-node: ^5.0.1 => 5.0.1 
    @sveltejs/kit: ^2.5.7 => 2.5.7 
    @sveltejs/vite-plugin-svelte: ^3.1.0 => 3.1.0 
    svelte: ^4.2.16 => 4.2.16

Severity

serious, but I can work around it

Additional Information

No response

paoloricciuti commented 5 months ago

I faced the same problem some times ago and from my research is not an issue with sveltekit. The problem should be that the moment you fetch something in node it will be decoded with the Content-Encoding regardless if you access the body or not.

So the response that you are returning to the browser is not exactly the same request, because node already decoded the body. But since the Content-Encoding header is still set the browser will try to decode with the wrong decoder.

A better solution then this

export async function GET() {
    const resp = await fetch('https://en.wikipedia.org/wiki/Main_Page');
    const t = await resp.text();
    return new Response(t);
}

should be this

export async function GET() {
    const resp = await fetch('https://en.wikipedia.org/wiki/Main_Page');
        // copy the response headers over
        const headers = new Headers(response.headers);
        // delete content-encoding.
        headers.delete('content-encoding');
        return new Response(response.body, {
                headers,
                status: response.status,
                statusText: response.statusText,
        });
}
clement-fischer commented 5 months ago

Thank you for replying @paoloricciuti!

Creating new headers works. I had tried to modify the headers from the response but had an error saying that the headers are immutable. I guess it makes sense now, can't modify headers from a response.

When I tried your solution it worked for my local development (Vite with HTTPS), but did not work when deployed (node-adapter with HTTPS). I was getting the following errornet::ERR_HTTP2_PROTOCOL_ERROR. Apparently this error can be thrown by Chrome when SSL is used and the value from the Content-Length header isn't right. Removing the Content-Length header fixed this problem, and so the solution looks like this:

export async function GET() {
    const resp = await fetch('https://en.wikipedia.org/wiki/Main_Page');
        // copy the response headers over
        const headers = new Headers(resp.headers);
        headers.delete('content-encoding');
        headers.delete('content-length');
        return new Response(resp.body, {
                headers,
                status: resp.status,
                statusText: resp.statusText,
        });
}

I understand the root cause regarding node decoding the compressed response, but I was wondering if this isn't something that could be fixed. From what I understand, fetch from SvelteKit isn't exactly the same as the native fetch web API, so perhaps this decoding can be avoided.

Perhaps proxying isn't something SvelteKit needs to do well, in many cases the proxy can be implemented before a request gets to SvelteKit. Many web servers support this:

In my case this application is deployed behind an Application Load Balancer on AWS, which doesn't support reverse proxying. The best case scenario for me would be that adapter-node supports this, like adapter-vercel does.

kentonv commented 5 months ago

I just ran into this too, and I would argue that it is a problem with SvelteKit, but it's subtle.

The Response API was originally invented as the return type of fetch(), meaning it's used on the client side. SvelteKit wants to use this type on the server side too, so that an application's handler function returns type Response. This is a great idea, as it means people don't need to learn a new API. It's also what many new serverless runtimes (e.g. Cloudflare Workers) do. So far, so good!

But there's a weird inconsistency: If a response has Content-Encoding: gzip, then fetch() automatically decompresses it. So the Response object you get already appears to have a decompressed body.

However, SvelteKit does not implement symmetric logic on the outgoing path. If your application returns a Response with Content-Encoding: gzip, SvelteKit does not automatically compress the body. It assumes the body is already compressed.

This means that a server-side handler function that does return fetch(...) does not work as a way to proxy to some other back-end.

Cloudflare Workers -- of which I'm the lead engineer -- actually realized this very early on, and implemented a solution: When returning a Response with Content-Encoding: gzip, the runtime automatically compresses the response body. Additionally, as an optimization, Workers detects when you are simply passing through a Response from a backend unmodified, and it elides the decompress/recompress round trip. Workers also adds a non-standard option to ResponseInit: encodeBody: "manual" disables auto-encoding. This can be used to construct a Respnose from bytes that are already compressed if needed.

I'm not sure if SvelteKit can implement the same round-trip optimization (it would require reaching into fetch internals), but it should probably at least implement the automatic compression on the way out. This isn't a violation of any standard, since the behavior of using Response on the server side is not standardized. But the client-side standard seems pretty clear that Response's body represents uncompressed data, so the server should probably do the same.

kentonv commented 5 months ago

Note this is specifically a problem in the Node adapter, probably in the setResponse() function:

https://github.com/sveltejs/kit/blob/afd227c7e67d61206749588819314c92031531b8/packages/kit/src/exports/node/index.js#L124

When running on Cloudflare Workers, I think SvelteKit will actually get automatic compression behavior from the platform (just based on looking at the code, haven't tested yet).

paoloricciuti commented 5 months ago

Note this is specifically a problem in the Node adapter, probably in the setResponse() function:

https://github.com/sveltejs/kit/blob/afd227c7e67d61206749588819314c92031531b8/packages/kit/src/exports/node/index.js#L124

When running on Cloudflare Workers, I think SvelteKit will actually get automatic compression behavior from the platform (just based on looking at the code, haven't tested yet).

Nope I had this problem with cloud flare too and that's why I've implemented the fix I sent above

However, SvelteKit does not implement symmetric logic on the outgoing path. If your application returns a Response with Content-Encoding: gzip, SvelteKit does not automatically compress the body. It assumes the body is already compressed.

I think SvelteKit just wrap the fetch provided by the platform is used on...I think compressing the body at the framework level is the wrong place. This should probably be fixed in undici or whichever implementation of fetch the platform is using.

kentonv commented 5 months ago

Nope I had this problem with cloud flare too and that's why I've implemented the fix I sent above

Did you actually see it when running in production, or only in local dev? Because local dev for SvelteKit apps uses Node even when you are targeting Cloudflare for deployment.

This should probably be fixed in undici or whichever implementation of fetch the platform is using.

The problem isn't in fetch()/undici, it's in the server-side handling. This isn't based on any framework, the coed is in SvelteKit, setResponse(), which I linked above.

paoloricciuti commented 5 months ago

Nope I had this problem with cloud flare too and that's why I've implemented the fix I sent above

Did you actually see it when running in production, or only in local dev? Because local dev for SvelteKit apps uses Node even when you are targeting Cloudflare for deployment.

This should probably be fixed in undici or whichever implementation of fetch the platform is using.

The problem isn't in fetch()/undici, it's in the server-side handling. This isn't based on any framework, the coed is in SvelteKit, setResponse(), which I linked above.

I remember having this problem in production but I might be wrong.

The server side handling you linked to is only relative to node tho right?

kentonv commented 5 months ago

The server side handling you linked to is only relative to node tho right?

Right -- but again, it's also the code used in local dev when targeting Cloudflare, at least for now. (We (Workers team) are working towards getting vite to run workerd under the hood!)

The Cloudflare code here appears to just return the Response object the app gave it:

https://github.com/sveltejs/kit/blob/afd227c7e67d61206749588819314c92031531b8/packages/adapter-cloudflare/src/worker.js#L68

The Cloudflare Workers Runtime will compress the response body in this case (or keep it compressed, if proxying).

paoloricciuti commented 5 months ago

Oh ok now i get it...fetch will decode the body only if the body it's used which is what adapter node is doing in setResponse...i was under the impression that fetch would decode regardless. Is my assumption correct?

kentonv commented 5 months ago

Oh ok now i get it...fetch will decode the body only if the body it's used which is what adapter node is doing in setResponse...i was under the impression that fetch would decode regardless. Is my assumption correct?

No no, fetch() always decodes. But in Cloudflare Workers, when you are implementing a server-side request handler that returns a Response, if it has Content-Encoding: gzip, the Workers runtime will encode it, that is, do the reverse of what fetch() does. Thus, it cancels out what fetch() did, so proxying a Response through "just works".

(And also, there is an optimization in Cloudflare Workers where it can detect when you are doing a round trip, and skip the round trip, but that's invisible...)

Simre1 commented 4 months ago

I am also experiencing this issue. It would be neat if there was some way to fetch without decompressing the response body.