honojs / hono

Web framework built on Web Standards
https://hono.dev
MIT License

Proxying compressed responses does not work out of the box #3518

Open haochenx opened 1 week ago

haochenx commented 1 week ago

What version of Hono are you using?

4.6.4

What runtime/platform is your app running on?

Local Dev Server (with @hono/vite-dev-server) and Cloudflare Workers Local Dev Server (via wrangler dev)

What steps can reproduce the bug?

Part 1

Run the dev server with the following middleware in place:

app.use("*", async (c, next) => {
  function shouldProxy() {
    const pathname = c.req.path;
    return (
      pathname.startsWith("/api/") ||
    );
  }
  if (c.env.IS_VITE_DEVSERVER && shouldProxy()) {
    const target = new URL(c.req.url);
    target.port = "8787";
    console.log(`proxying to ${target.toString()}`);
    const response = await fetch(new Request(target, c.req.raw));
    return new Response(response.body, response);
  } else {
    return next();
  }
});

Now, attempting to access localhost:5173/api/hello (e.g. using the httpie command shown below) gives the following error:

http -v GET :5173/api/hello
GET /api/hello HTTP/1.1
Accept: */*
Accept-Encoding: gzip, deflate
Connection: keep-alive
Host: localhost:5173
User-Agent: HTTPie/3.2.2

HTTP/1.1 200 OK
Connection: keep-alive
Date: Mon, 14 Oct 2024 14:59:42 GMT
Keep-Alive: timeout=5
content-encoding: gzip
content-type: text/plain;charset=UTF-8
transfer-encoding: chunked

http: error: ContentDecodingError: ('Received response with content-encoding: gzip, but failed to decode it.', error('Error -3 while decompressing data: incorrect header check'))

In the Vite dev server log, we can see the following output:

proxying to http://localhost:8787/api/hello

note that directly accessing http://localhost:8787/api/hello gives the following result:

http -v GET :8787/api/hello
GET /api/hello HTTP/1.1
Accept: */*
Accept-Encoding: gzip, deflate
Connection: keep-alive
Host: localhost:8787
User-Agent: HTTPie/3.2.2

HTTP/1.1 200 OK
Content-Encoding: gzip
Content-Type: text/plain;charset=UTF-8
Transfer-Encoding: chunked

hi!

Part 2

If the middleware is modified so that the "content-encoding" header is stripped before returning, everything works fine, e.g. changing the return statement to the following:

    return new Response(response.body, {
      // Note: spreading a Response does not copy fields like status (they are
      // prototype getters), so copy them explicitly.
      status: response.status,
      statusText: response.statusText,
      headers: [...response.headers.entries()].filter(
        ([k]) => k.toLowerCase() !== "content-encoding",
      ),
    });

gives the expected result.

What is the expected behavior?

Manually stripping the "content-encoding" header from the proxied response should not be required for the proxying route to work.

What do you see instead?

(explained above)

Additional information

No response

yusukebe commented 1 week ago

Hi @haochenx

I think this is caused by Wrangler automatically compressing the content and adding a content-encoding header. Does changing the proxy address, e.g. to https://example.com, work? If so, it's not a Hono-side bug.
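
A minimal sketch of such a test, adapted from the Part 1 middleware (https://example.com is only a placeholder target here, and the IS_VITE_DEVSERVER check is omitted):

app.use("*", async (c, next) => {
  if (c.req.path.startsWith("/api/")) {
    // Proxy to an external origin instead of the local Wrangler server to
    // check whether the corruption is Wrangler-specific.
    const response = await fetch(new Request("https://example.com", c.req.raw));
    return new Response(response.body, response);
  }
  return next();
});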

haochenx commented 4 days ago

Hi @yusukebe,

Thanks for taking a look, and my apologies for the late reply. I can confirm that if the upstream does not use compression, Hono behaves as expected. Yet if the upstream uses compression, even if the upstream is not Wrangler, the proxied response from Hono is corrupted.

Since (1) my example follows the proxying approach recommended in Hono's documentation (ref https://hono.dev/examples/proxy), and (2) the request made by fetch apparently indicates support for compression (I observed "Accept-Encoding": "gzip, deflate" in my case with the Vite dev server), I believe most developers will expect Hono to decompress the response body from the upstream if necessary before sending it to the original requester.

yusukebe commented 4 days ago

@haochenx Thanks for the explanation.

I haven't investigated it yet, but it seems to be a problem with @hono/node-server. The error is not thrown when I run it on Deno or Bun.

Hi @usualoma, I'll work on it later, but if you can, could you take a look?

haochenx commented 4 days ago

Thanks for taking another look!

I believe my Vite dev server is running on Bun (version 1.1.3). It's started by running bun run dev, where the "dev" script is defined as simply vite.

I will try it with the latest version of Bun later and let you know. If I can reproduce it, I will make a minimal reproduction repo for you to investigate (it may take a few days).

I have been a little busy recently so I cannot promise much, but I will try my best to help with your investigation.

yusukebe commented 4 days ago

@haochenx

@hono/vite-dev-server uses @hono/node-server internally, so I think it's a problem with @hono/node-server, though I haven't investigated the details.

haochenx commented 4 days ago

I see. I'll take a look when I get time.

yusukebe commented 4 days ago

@haochenx

Thanks for taking a look, and my apologies for the late reply. I can confirm that if the upstream does not use compression, Hono behaves as expected. Yet if the upstream uses compression, even if the upstream is not Wrangler, the proxied response from Hono is corrupted.

Can you share target URLs to reproduce the problem other than http://localhost:8787?

haochenx commented 4 days ago

Can you share target URLs to reproduce the problem other than http://localhost:8787?

Sure! I was using our company's homepage for testing:

Reproducing the bug with https://kxc.inc/404:

app.use("*", async (c, next) => {
  const response = await fetch(
    new Request("https://kxc.inc/404", c.req.raw),
  );
  return new Response(response.body, response);
});

This gives the same error as with Wrangler:

$ http -v :5173/api/hello
GET /api/hello HTTP/1.1
Accept: */*
Accept-Encoding: gzip, deflate
Connection: keep-alive
Host: localhost:5173
User-Agent: HTTPie/3.2.2

HTTP/1.1 404 Not Found
alt-svc: h3=":443"; ma=86400
cf-cache-status: BYPASS
cf-ray: 8d70efd1beb28341-KIX
connection: keep-alive
content-encoding: gzip
content-type: text/html
date: Wed, 23 Oct 2024 10:10:21 GMT
nel: {"success_fraction":0,"report_to":"cf-nel","max_age":604800}
report-to: {"endpoints":[{"url":"https:\/\/a.nel.cloudflare.com\/report\/v4?s=Gw7xhg8the3iz2c0DOtS8yVpep4brIwjULZOmv9Sjm0FVwogxVCWh2I%2FzseyUi3i0bu1G1%2FRUZG8xdKqXrow04lEUKMuIpsOhNrrynpOcrR46qPjuobRIyw4"}],"group":"cf-nel","max_age":604800}
server: cloudflare
strict-transport-security: max-age=31536000; includeSubDomains; preload
transfer-encoding: chunked
vary: Accept-Encoding

http: error: ContentDecodingError: ('Received response with content-encoding: gzip, but failed to decode it.', error('Error -3 while decompressing data: incorrect header check'))

Meanwhile, a site that does not support compression (e.g. http://httpbin.org/get) works just fine:

app.use("*", async (c, next) => {
  const response = await fetch(
    new Request("http://httpbin.org/get", c.req.raw),
  );
  return new Response(response.body, response);
});

$ http -v :5173/api/hello
GET /api/hello HTTP/1.1
Accept: */*
Accept-Encoding: gzip, deflate
Connection: keep-alive
Host: localhost:5173
User-Agent: HTTPie/3.2.2

HTTP/1.1 200 OK
access-control-allow-credentials: true
access-control-allow-origin: *
cf-team: 235a74957d0000834140024400000001
connection: keep-alive
content-length: 316
content-type: application/json
date: Wed, 23 Oct 2024 10:10:45 GMT
server: gunicorn/19.9.0

{
    "args": {},
    "headers": {
        "Accept": "*/*",
        "Accept-Encoding": "gzip, deflate",
        "Accept-Language": "*",
        "Connection": "keep-alive",
        "Host": "httpbin.0k",
        "Sec-Fetch-Mode": "cors",
        "User-Agent": "HTTPie/3.2.2"
    },
    "origin": "172.18.0.1",
    "url": "http://httpbin.0k/get"
}

haochenx commented 4 days ago

By the way, I just noticed that curl -v localhost:5173/api/hello works just fine, but curl --compressed -v localhost:5173/api/hello results in curl: (23) Failed reading the chunked-encoded stream.

One trigger for the buggy behavior might be the presence of an Accept-Encoding header in the request.

curl -v localhost:5173/api/hello output

curl -v localhost:5173/api/hello
* Host localhost:5173 was resolved.
* IPv6: ::1
* IPv4: 127.0.0.1
*   Trying [::1]:5173...
* Connected to localhost (::1) port 5173
> GET /api/hello HTTP/1.1
> Host: localhost:5173
> User-Agent: curl/8.9.1
> Accept: */*
>
* Request completely sent off
< HTTP/1.1 404 Not Found
< alt-svc: h3=":443"; ma=86400
< cf-cache-status: BYPASS
< cf-ray: 8d70faf718598341-KIX
< connection: keep-alive
< content-encoding: br
< content-type: text/html
< date: Wed, 23 Oct 2024 10:17:57 GMT
< nel: {"success_fraction":0,"report_to":"cf-nel","max_age":604800}
< report-to: {"endpoints":[{"url":"https:\/\/a.nel.cloudflare.com\/report\/v4?s=Wn2uyWpiBXTBV6gbyHp50%2BpD%2B8W23dgcyhIla%2Ffmaxzm8Dpc9dCdl7r%2BqagdmB0zNTZgcm5Mdez9nSLbVDTJ7Jmc%2Fabq7%2FveCZIaTINOlQmerSXmMSbszWR7"}],"group":"cf-nel","max_age":604800}
< server: cloudflare
< strict-transport-security: max-age=31536000; includeSubDomains; preload
< transfer-encoding: chunked
< vary: Accept-Encoding
<
* Connection #0 to host localhost left intact
<!DOCTYPE html><html lang="ja"> <head><meta charset="UTF-8"><meta name="viewport" content="width=device-width"><title>Not Found - web.kxc</title><meta content="Kotoi-Xie Consultancy, Inc. Homepage - Not Found" name="description"/><link rel="stylesheet" href="/_astro/0.k-JdlHC7.css" /><script type="module" src="/_astro/page.Sg4V0Ns0.js"></script></head> <body> <div class="_topContainer_1wfmb_1"><div class="_mainContainer_1wfmb_25"><h1 class="_mainStatusCode_1wfmb_30">404</h1><div class="_mainReasonContainer_1wfmb_41"><div class="_mainReason_1wfmb_41"><h2>This page could not be found.</h2><h2>このページが見つかりませんでした。</h2></div></div></div><div><a href="/">Top Page / トップページ</a></div></div> </body></html><script>import("/@vite/client")</script>

curl --compressed -v localhost:5173/api/hello output

curl --compressed -v localhost:5173/api/hello
* Host localhost:5173 was resolved.
* IPv6: ::1
* IPv4: 127.0.0.1
*   Trying [::1]:5173...
* Connected to localhost (::1) port 5173
> GET /api/hello HTTP/1.1
> Host: localhost:5173
> User-Agent: curl/8.9.1
> Accept: */*
> Accept-Encoding: deflate, gzip, br, zstd
>
* Request completely sent off
< HTTP/1.1 404 Not Found
< alt-svc: h3=":443"; ma=86400
< cf-cache-status: BYPASS
< cf-ray: 8d70fbe08dec8341-KIX
< connection: keep-alive
< content-encoding: br
< content-type: text/html
< date: Wed, 23 Oct 2024 10:18:35 GMT
< nel: {"success_fraction":0,"report_to":"cf-nel","max_age":604800}
< report-to: {"endpoints":[{"url":"https:\/\/a.nel.cloudflare.com\/report\/v4?s=oEznor%2FDNUGQuiU3RYd4TILXMsANXFOj4J8%2BUlT4hoM9VeMxKPX4%2FoBaXW6tPIS61dv5L9VOvqSXHJBs69wrI1N4Xi4yWMJUnzdft1SLDABcWp%2BPCDyGng6V"}],"group":"cf-nel","max_age":604800}
< server: cloudflare
< strict-transport-security: max-age=31536000; includeSubDomains; preload
< transfer-encoding: chunked
< vary: Accept-Encoding
<
* Failed reading the chunked-encoded stream
* closing connection #0
curl: (23) Failed reading the chunked-encoded stream

haochenx commented 4 days ago

I just tried the newest version of Bun (namely 1.1.32) and the behavior is the same (as expected, I guess).

yusukebe commented 4 days ago

Thanks. As I said above, it's a @hono/node-server matter. You can reproduce it with this code, without @hono/vite-dev-server:

import { Hono } from 'hono'
import { serve } from '@hono/node-server'

const app = new Hono()

app.use('*', async (c, next) => {
  const response = await fetch(new Request('https://kxc.inc/404', c.req.raw))
  return new Response(response.body, response)
})

serve(app)

haochenx commented 4 days ago

Not sure whether it matters at this point, but I just extracted the relevant code and made a reproduction repository: https://github.com/haochenx/hono-3518-reproduce

edit: added https://github.com/haochenx/hono-3518-reproduce/blob/main/src/minimal.ts

haochenx commented 4 days ago

Thanks. I reproduced the problem with only @hono/node-server and added it to the reproduction repo.

I also changed the upstream to http://kxc.inc/404 (https is giving me SSL errors on the server side) and it gives the same behavior as far as this issue is concerned.

I have to go for today, but I might take a look at the root cause when I get time in the coming days.

yusukebe commented 4 days ago

I found that overriding fetch in @hono/node-server causes the problem:

https://github.com/honojs/node-server/blob/cd4d4d7a6d33532713086c2e896d133dc5f5e3e5/src/globals.ts#L14

As a result, the following works well:

import { Hono } from 'hono'
import { serve } from '@hono/node-server'

const app = new Hono()

app.get('*', async (c) => {
  const response = await fetch(
    new Request('https://kxc.inc/404', {
      compress: true,
      ...c.req.raw
    })
  )
  return new Response(response.body, response)
})

serve(app)

@usualoma any thoughts?

usualoma commented 3 days ago

I don't think this is a @hono/node-server issue. I got the same error when I ran the following code with bun.

import { Hono } from 'hono'

const app = new Hono()
app.get('/', () => fetch('https://example.com/'))

export default app

The following code does not produce an error. (The same fix also works when using @hono/node-server.)

import { Hono } from 'hono'

const app = new Hono()
app.get('/', () =>
  fetch('https://example.com/', {
    headers: {
      'Accept-Encoding': '',
    },
  })
)

export default app

Currently, when using fetch() with any of the runtimes, a request header like Accept-Encoding: gzip, deflate, br is added by default, and if the server supports it, the response comes back with Content-Encoding: gzip. However, the data read via the Response object's response.body is already decoded. (For reference, Deno removes the content-encoding header from the response.)

This may not be a problem that Hono should solve as a framework.
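
To illustrate the mismatch described above, here is a minimal sketch (not from the thread; the target URL is only a placeholder) of a handler that drops the headers that no longer describe the decoded body before forwarding it:

import { Hono } from 'hono'

const app = new Hono()

app.get('/proxied', async () => {
  // fetch() has already decompressed the body, so the upstream's
  // content-encoding / content-length headers no longer describe it.
  const upstream = await fetch('https://example.com/')
  const headers = new Headers(upstream.headers)
  headers.delete('content-encoding')
  headers.delete('content-length')
  return new Response(upstream.body, {
    status: upstream.status,
    statusText: upstream.statusText,
    headers,
  })
})

export default app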

haochenx commented 3 days ago

However, the data read via the Response object's response.body is already decoded.

If this semantics is guaranteed, which I guess it should be according to the web standard, my workaround is probably the proper way to handle this. If so, the best course of action might be to warn developers about this behavior in the documentation and introduce the workaround (and potentially provide an easier way to perform it, e.g. something along the lines of adding an option to the Response constructor).

usualoma commented 3 days ago

Hi @haochenx, thank you for your comment.

Yes, I agree with you.

my workaround is probably the proper way to handle this

It would be good to have something like a middleware or a utility method that makes your workaround possible with a simpler description, but it isn't easy to find a good compromise.
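
For illustration only, such a utility might look like the sketch below; stripEncodingHeaders is an invented name, not an existing Hono API:

// Hypothetical helper (invented for illustration): copy the proxied response,
// dropping the headers that no longer match the already-decoded body.
const stripEncodingHeaders = (response) => {
  const headers = new Headers(response.headers)
  headers.delete('content-encoding')
  headers.delete('content-length')
  return new Response(response.body, {
    status: response.status,
    statusText: response.statusText,
    headers,
  })
}

// Possible usage in a route handler:
// return stripEncodingHeaders(await fetch(new Request(target, c.req.raw)))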

yusukebe commented 2 days ago

@usualoma

Thank you for the explanation! It's okay to remove the Accept-Encoding header from fetch when it accesses the origin.

@haochenx

So, I think the following code is simple and good. What do you think of it?

import { serve } from '@hono/node-server'
import { Hono } from 'hono'

const app = new Hono()

const target = 'https://kxc.inc/404'

app.get('/', (c) => {
  const req = new Request(target, c.req.raw)
  req.headers.delete('Accept-Encoding')
  return fetch(req)
})

serve(app)

If it's fine, we may add a description to the proxy page on the website: https://hono.dev/examples/proxy