honojs / hono

Web framework built on Web Standards
https://hono.dev
MIT License

Proxying compressed responses does not work out of the box #3518

Open haochenx opened 1 month ago

haochenx commented 1 month ago

What version of Hono are you using?

4.6.4

What runtime/platform is your app running on?

Local Dev Server (with @hono/vite-dev-server) and Cloudflare Workers Local Dev Server (via wrangler dev)

What steps can reproduce the bug?

Part 1

With the following middleware:

app.use("*", async (c, next) => {
  function shouldProxy() {
    const pathname = c.req.path;
    return pathname.startsWith("/api/");
  }
  if (c.env.IS_VITE_DEVSERVER && shouldProxy()) {
    const target = new URL(c.req.url);
    target.port = "8787";
    console.log(`proxying to ${target.toString()}`);
    const response = await fetch(new Request(target, c.req.raw));
    return new Response(response.body, response);
  } else {
    return next();
  }
});

Now, attempting to access localhost:5173/api/hello (e.g. with httpie) gives the following error:

http -v GET :5173/api/hello
GET /api/hello HTTP/1.1
Accept: */*
Accept-Encoding: gzip, deflate
Connection: keep-alive
Host: localhost:5173
User-Agent: HTTPie/3.2.2

HTTP/1.1 200 OK
Connection: keep-alive
Date: Mon, 14 Oct 2024 14:59:42 GMT
Keep-Alive: timeout=5
content-encoding: gzip
content-type: text/plain;charset=UTF-8
transfer-encoding: chunked

http: error: ContentDecodingError: ('Received response with content-encoding: gzip, but failed to decode it.', error('Error -3 while decompressing data: incorrect header check'))

on the vite devserver log, we can see the following output:

proxying to http://localhost:8787/api/hello

note that directly accessing http://localhost:8787/api/hello gives the following result:

http -v GET :8787/api/hello
GET /api/hello HTTP/1.1
Accept: */*
Accept-Encoding: gzip, deflate
Connection: keep-alive
Host: localhost:8787
User-Agent: HTTPie/3.2.2

HTTP/1.1 200 OK
Content-Encoding: gzip
Content-Type: text/plain;charset=UTF-8
Transfer-Encoding: chunked

hi!

Part 2

If the middleware is modified so that the "content-encoding" header is stripped before returning, everything works fine. E.g. changing the return statement to the following:

    return new Response(response.body, {
      ...response,
      headers: [...response.headers.entries()].filter(
        ([k]) => k.toLocaleLowerCase() !== "content-encoding",
      ),
    });

gives the expected result.
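The workaround above can be factored into a small reusable function; a sketch (the name `stripContentEncoding` is illustrative, not a Hono API):

```typescript
// Sketch only: the Part 2 workaround as a reusable function.
// `stripContentEncoding` is an illustrative name, not part of Hono.
function stripContentEncoding(res: Response): Response {
  const headers = new Headers(res.headers);
  headers.delete("content-encoding");
  headers.delete("content-length"); // refers to the compressed body, so it is stale too
  return new Response(res.body, {
    status: res.status,
    statusText: res.statusText,
    headers,
  });
}

// usage inside the middleware:
//   return stripContentEncoding(await fetch(new Request(target, c.req.raw)));
```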

What is the expected behavior?

Manually stripping the "content-encoding" header from the proxied response should not be required for the proxying route to work.

What do you see instead?

(explained above)

Additional information

No response

yusukebe commented 1 month ago

Hi @haochenx

I think this is caused by Wrangler automatically compressing the content and adding a content-encoding header. Does changing the proxy address, e.g. to https://example.com, work? If so, it's not a Hono-side bug.

haochenx commented 1 month ago

Hi @yusukebe,

Thanks for taking a look, and my apologies for the late reply. I can confirm that if the upstream does not use compression, Hono behaves as expected. However, if the upstream uses compression, even when the upstream is not wrangler, the proxied response from Hono is corrupted.

Since (1) my example follows the recommended way of implementing proxying in Hono's documentation (ref https://hono.dev/examples/proxy), and (2) the request made by fetch apparently advertises support for compression (I observed "Accept-Encoding: gzip, deflate" in my case with a vite dev server),

I believe most developers will expect Hono to decompress the response body from the upstream, if necessary, before sending it to the original requester.

yusukebe commented 1 month ago

@haochenx Thanks for the explanation.

I haven't investigated it yet, but it seems to be a problem with @hono/node-server. The error is not thrown when I run it on Deno or Bun.

Hi @usualoma I'll work on it later, but if you can, can you take a look?

haochenx commented 1 month ago

Thanks for taking another look!

I believe my vite dev server is running on Bun (version 1.1.3). It's started by running bun run dev, where the "dev" script is defined as simply vite.

I will try it with the latest version of Bun later and let you know. If I can reproduce it (it may take a few days), I will make a minimal reproduction repo for you to investigate.

I am a little busy recently, so I cannot promise a timeline, but I will try my best to help with the investigation.

yusukebe commented 1 month ago

@haochenx

@hono/vite-dev-server uses @hono/node-server internally. So I think it's a problem with @hono/node-server, though I've not investigated the details.

haochenx commented 1 month ago

I see. I'll take a look when I get time.

yusukebe commented 1 month ago

@haochenx

Thanks for taking a look and my apologies for the late reply. I can confirm that if the upstream does not use compression, hono will behave as expected. Yet if the upstream uses compression, even if the upstream not being wrangler, proxied response from hono will be corrupted.

Can you share target URLs to reproduce the problem other than http://localhost:8787?

haochenx commented 1 month ago

Can you share target URLs to reproduce the problem other than http://localhost:8787?

Sure! I was using our company's homepage for testing:

reproduce the bug with https://kxc.inc/404:

app.use("*", async (c, next) => {
  const response = await fetch(
    new Request("https://kxc.inc/404", c.req.raw),
  );
  return new Response(response.body, response);
});

this will give the same error as with wrangler:

$ http -v :5173/api/hello
GET /api/hello HTTP/1.1
Accept: */*
Accept-Encoding: gzip, deflate
Connection: keep-alive
Host: localhost:5173
User-Agent: HTTPie/3.2.2

HTTP/1.1 404 Not Found
alt-svc: h3=":443"; ma=86400
cf-cache-status: BYPASS
cf-ray: 8d70efd1beb28341-KIX
connection: keep-alive
content-encoding: gzip
content-type: text/html
date: Wed, 23 Oct 2024 10:10:21 GMT
nel: {"success_fraction":0,"report_to":"cf-nel","max_age":604800}
report-to: {"endpoints":[{"url":"https:\/\/a.nel.cloudflare.com\/report\/v4?s=Gw7xhg8the3iz2c0DOtS8yVpep4brIwjULZOmv9Sjm0FVwogxVCWh2I%2FzseyUi3i0bu1G1%2FRUZG8xdKqXrow04lEUKMuIpsOhNrrynpOcrR46qPjuobRIyw4"}],"group":"cf-nel","max_age":604800}
server: cloudflare
strict-transport-security: max-age=31536000; includeSubDomains; preload
transfer-encoding: chunked
vary: Accept-Encoding

http: error: ContentDecodingError: ('Received response with content-encoding: gzip, but failed to decode it.', error('Error -3 while decompressing data: incorrect header check'))

Meanwhile, a site that does not support compression (e.g. http://httpbin.org/get) works just fine:

app.use("*", async (c, next) => {
  const response = await fetch(
    new Request("http://httpbin.org/get", c.req.raw),
  );
  return new Response(response.body, response);
});
$ http -v :5173/api/hello
GET /api/hello HTTP/1.1
Accept: */*
Accept-Encoding: gzip, deflate
Connection: keep-alive
Host: localhost:5173
User-Agent: HTTPie/3.2.2

HTTP/1.1 200 OK
access-control-allow-credentials: true
access-control-allow-origin: *
cf-team: 235a74957d0000834140024400000001
connection: keep-alive
content-length: 316
content-type: application/json
date: Wed, 23 Oct 2024 10:10:45 GMT
server: gunicorn/19.9.0

{
    "args": {},
    "headers": {
        "Accept": "*/*",
        "Accept-Encoding": "gzip, deflate",
        "Accept-Language": "*",
        "Connection": "keep-alive",
        "Host": "httpbin.0k",
        "Sec-Fetch-Mode": "cors",
        "User-Agent": "HTTPie/3.2.2"
    },
    "origin": "172.18.0.1",
    "url": "http://httpbin.0k/get"
}
haochenx commented 1 month ago

btw I just noticed that curl -v localhost:5173/api/hello works just fine, but curl --compressed -v localhost:5173/api/hello results in curl: (23) Failed reading the chunked-encoded stream.

One trigger for the buggy behavior might be the presence of an Accept-Encoding header in the request.

curl -v localhost:5173/api/hello output

curl -v localhost:5173/api/hello
* Host localhost:5173 was resolved.
* IPv6: ::1
* IPv4: 127.0.0.1
*   Trying [::1]:5173...
* Connected to localhost (::1) port 5173
> GET /api/hello HTTP/1.1
> Host: localhost:5173
> User-Agent: curl/8.9.1
> Accept: */*
>
* Request completely sent off
< HTTP/1.1 404 Not Found
< alt-svc: h3=":443"; ma=86400
< cf-cache-status: BYPASS
< cf-ray: 8d70faf718598341-KIX
< connection: keep-alive
< content-encoding: br
< content-type: text/html
< date: Wed, 23 Oct 2024 10:17:57 GMT
< nel: {"success_fraction":0,"report_to":"cf-nel","max_age":604800}
< report-to: {"endpoints":[{"url":"https:\/\/a.nel.cloudflare.com\/report\/v4?s=Wn2uyWpiBXTBV6gbyHp50%2BpD%2B8W23dgcyhIla%2Ffmaxzm8Dpc9dCdl7r%2BqagdmB0zNTZgcm5Mdez9nSLbVDTJ7Jmc%2Fabq7%2FveCZIaTINOlQmerSXmMSbszWR7"}],"group":"cf-nel","max_age":604800}
< server: cloudflare
< strict-transport-security: max-age=31536000; includeSubDomains; preload
< transfer-encoding: chunked
< vary: Accept-Encoding
<
* Connection #0 to host localhost left intact
<!DOCTYPE html><html lang="ja"> <head><meta charset="UTF-8"><meta name="viewport" content="width=device-width"><title>Not Found - web.kxc</title><meta content="Kotoi-Xie Consultancy, Inc. Homepage - Not Found" name="description"/><link rel="stylesheet" href="/_astro/0.k-JdlHC7.css" /><script type="module" src="/_astro/page.Sg4V0Ns0.js"></script></head> <body> <div class="_topContainer_1wfmb_1"><div class="_mainContainer_1wfmb_25"><h1 class="_mainStatusCode_1wfmb_30">404</h1><div class="_mainReasonContainer_1wfmb_41"><div class="_mainReason_1wfmb_41"><h2>This page could not be found.</h2><h2>このページが見つかりませんでした。</h2></div></div></div><div><a href="/">Top Page / トップページ</a></div></div> </body></html><script>import("/@vite/client")</script>

curl --compressed -v localhost:5173/api/hello output

curl --compressed -v localhost:5173/api/hello
* Host localhost:5173 was resolved.
* IPv6: ::1
* IPv4: 127.0.0.1
*   Trying [::1]:5173...
* Connected to localhost (::1) port 5173
> GET /api/hello HTTP/1.1
> Host: localhost:5173
> User-Agent: curl/8.9.1
> Accept: */*
> Accept-Encoding: deflate, gzip, br, zstd
>
* Request completely sent off
< HTTP/1.1 404 Not Found
< alt-svc: h3=":443"; ma=86400
< cf-cache-status: BYPASS
< cf-ray: 8d70fbe08dec8341-KIX
< connection: keep-alive
< content-encoding: br
< content-type: text/html
< date: Wed, 23 Oct 2024 10:18:35 GMT
< nel: {"success_fraction":0,"report_to":"cf-nel","max_age":604800}
< report-to: {"endpoints":[{"url":"https:\/\/a.nel.cloudflare.com\/report\/v4?s=oEznor%2FDNUGQuiU3RYd4TILXMsANXFOj4J8%2BUlT4hoM9VeMxKPX4%2FoBaXW6tPIS61dv5L9VOvqSXHJBs69wrI1N4Xi4yWMJUnzdft1SLDABcWp%2BPCDyGng6V"}],"group":"cf-nel","max_age":604800}
< server: cloudflare
< strict-transport-security: max-age=31536000; includeSubDomains; preload
< transfer-encoding: chunked
< vary: Accept-Encoding
<
* Failed reading the chunked-encoded stream
* closing connection #0
curl: (23) Failed reading the chunked-encoded stream
haochenx commented 1 month ago

I just tried the newest version of Bun (namely, 1.1.32) and the behavior is the same (as expected, I guess).

yusukebe commented 1 month ago

Thanks. As I said above, it's a @hono/node-server matter. You can reproduce it with this code, without @hono/vite-dev-server:

import { Hono } from 'hono'
import { serve } from '@hono/node-server'

const app = new Hono()

app.use('*', async (c, next) => {
  const response = await fetch(new Request('https://kxc.inc/404', c.req.raw))
  return new Response(response.body, response)
})

serve(app)
haochenx commented 1 month ago

Not sure whether it matters at this point, but I just extracted the relevant code and made a reproduction repository: https://github.com/haochenx/hono-3518-reproduce

edit: added https://github.com/haochenx/hono-3518-reproduce/blob/main/src/minimal.ts

haochenx commented 1 month ago

Thanks. I reproduced the problem with only @hono/node-server and added it to the reproduction repo.

I also changed the upstream to http://kxc.inc/404 (https is giving me SSL errors on the server side..) and it shows the same behavior as far as this issue is concerned.

I have to go for today, but I might take a look at the root cause when I get time in the coming days.

yusukebe commented 1 month ago

I found that overriding fetch in @hono/node-server causes the problem:

https://github.com/honojs/node-server/blob/cd4d4d7a6d33532713086c2e896d133dc5f5e3e5/src/globals.ts#L14

As a result, the following works well:

import { Hono } from 'hono'
import { serve } from '@hono/node-server'

const app = new Hono()

app.get('*', async (c) => {
  const response = await fetch(
    new Request('https://kxc.inc/404', {
      compress: true,
      ...c.req.raw
    })
  )
  return new Response(response.body, response)
})

serve(app)

@usualoma any thoughts?

usualoma commented 1 month ago

I don't think this is a @hono/node-server issue. I got the same error when I ran the following code with bun.

import { Hono } from 'hono'

const app = new Hono()
app.get('/', () => fetch('https://example.com/'))

export default app

The following code will not produce an error. (The same error suppression will also be applied when using @hono/node-server.)

import { Hono } from 'hono'

const app = new Hono()
app.get('/', () =>
  fetch('https://example.com/', {
    headers: {
      'Accept-Encoding': '',
    },
  })
)

export default app

Currently, when using fetch() in any of the runtimes, a request header like Accept-Encoding: gzip, deflate, br is added by default, and if the server supports it, the response comes back with Content-Encoding: gzip. However, the data read via the Response object's response.body is already decoded. (For reference, Deno removes the content-encoding header from the response.)

This may not be a problem that hono should solve as a framework.
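The mismatch described above can be seen without any network at all; a minimal sketch (the literal headers are illustrative, mimicking what fetch returns after transparent decompression):

```typescript
// Sketch of the mismatch: fetch() hands callers an already-decoded
// body, but the upstream's content-encoding header is still attached
// to the Response object.
const decoded = new Response("hi!", {
  // what fetch() effectively returns after transparent decompression:
  // a plain-text body with a stale header
  headers: { "content-encoding": "gzip", "content-type": "text/plain" },
});

// the common proxy pattern from this thread forwards the stale header:
const proxied = new Response(decoded.body, decoded);
console.log(proxied.headers.get("content-encoding")); // "gzip", yet the body is not compressed
```

A client that honors the header then tries to gunzip an uncompressed body, which is exactly the failure httpie and curl --compressed reported earlier in this thread.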

haochenx commented 1 month ago

However, the data read via the Response object's response.body() is decoded.

If this semantics is guaranteed, which I guess it should be according to the web standard, my workaround is probably the proper way to handle this. If so, the best course of action might be to warn developers about this behavior in the documentation and introduce the workaround (and potentially provide an easier way to perform it, e.g. something along the lines of an option in the Response constructor).

usualoma commented 1 month ago

Hi @haochenx, thank you for your comment.

Yes, I agree with you.

my workaround is probably the proper way to handle this

It would be good to have something like middleware or a utility method that makes your workaround possible with a simpler description, but it isn't easy to find a good compromise.

yusukebe commented 1 month ago

@usualoma

Thank you for the explanation! It's okay to remove the Accept-Encoding header from fetch when it accesses the origin.

@haochenx

So, I think the following code is simple and good. What do you think of it?

import { serve } from '@hono/node-server'
import { Hono } from 'hono'

const app = new Hono()

const target = 'https://kxc.inc/404'

app.get('/', (c) => {
  const req = new Request(target, c.req.raw)
  req.headers.delete('Accept-Encoding')
  return fetch(req)
})

serve(app)

If it's fine, we may add the description to the proxy page on the website: https://hono.dev/examples/proxy

haochenx commented 4 weeks ago

@usualoma @yusukebe Thank you for commenting. I am sorry for the late reply.

Regarding deleting the 'Accept-Encoding' header from the request, I am personally against it, as it would make proxying much less efficient for large compressible payloads (think of the giant JavaScript files almost every website serves these days).

(I'm not yet familiar with Hono's library organization and naming conventions, so this is just a suggestion.) I think a utility function that provides functional-update semantics for headers on the response object, e.g. somewhere under hono/utils, might be the cleanest solution. Say we have a polymorphic helper function withHeaderDeleted whose type is:

function withHeaderDeleted(headerName: string, resp: Response): Response;
function withHeaderDeleted(headerName: string, req: Request): Request;
// .. maybe also overloads for other types of objects

Then we can write the proxying route as:

import { serve } from '@hono/node-server'
import { Hono } from 'hono'
import { withHeaderDeleted } from 'hono/utils/updaters'

const app = new Hono()

const target = 'https://kxc.inc/404'

app.get('/', async (c) => {
  const req = new Request(target, c.req.raw)
  req.headers.delete('Accept-Encoding')
  const resp = await fetch(req);
  return new Response(resp.body, withHeaderDeleted('Content-Encoding', resp));
})

serve(app)

which should both function correctly and take advantage of compression.

Please let me know what you think.
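For concreteness, here is one possible sketch of the hypothetical withHeaderDeleted helper proposed in this comment (not part of Hono; the name and shape are only the suggestion above):

```typescript
// Hypothetical helper sketched in this comment; not a Hono API.
// Builds a copy of the Request/Response with one header removed,
// without mutating the original object's header list.
function withHeaderDeleted(headerName: string, r: Response): Response;
function withHeaderDeleted(headerName: string, r: Request): Request;
function withHeaderDeleted(headerName: string, r: Request | Response): Request | Response {
  const headers = new Headers(r.headers);
  headers.delete(headerName);
  if (r instanceof Response) {
    return new Response(r.body, { status: r.status, statusText: r.statusText, headers });
  }
  return new Request(r, { headers });
}
```

Note that the Response overload also serves as a ResponseInit, so the `new Response(resp.body, withHeaderDeleted('Content-Encoding', resp))` call above works as written.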

usualoma commented 3 weeks ago

Hi @haochenx

As you say, I think it would be good if I could receive compressed responses from the backend.

If there is demand for support for proxy patterns, I think it would be better to support them at a slightly higher level of abstraction (my preference):

// src/utils/proxy.ts

const ignoreHeadersRe = /^content-(?:encoding|length|range)$/i

// TBD: or perhaps simply named `proxy`
export const proxyFetch: typeof fetch = async (request, options) => {
  const req = new Request(request, options)
  req.headers.delete('accept-encoding') // TBD: there may be cases where you want to explicitly specify
  const res = await fetch(req)
  return new Response(res.body, {
    ...res,
    headers: [...res.headers.entries()].filter(([k]) => !ignoreHeadersRe.test(k)),
  })
}
// app.ts
import { Hono } from 'hono'
import { proxyFetch } from 'hono/utils/proxy'

const target = 'https://example.com'

const app = new Hono()
app.get('/', async (c) => proxyFetch(new Request(target, c.req.raw)))

export default app
yusukebe commented 3 weeks ago

Ah! Or, instead of that, making a "Proxy Helper" sounds good!

import { proxyFetch } from 'hono/proxy'
haochenx commented 3 weeks ago

I do think adding a proxy helper is quite reasonable, as it's really not trivial to get proxying to work reliably with only fetch and new Response. I remember having a production server where I ended up manually crafting the request and response objects to deal with all the quirks (cookies, caching behavior, X-Forwarded-For, CORS, etc.).

usualoma commented 3 weeks ago

@haochenx Thanks for your comment! I've made some changes to the header so that it can be adjusted more finely, and created https://github.com/honojs/hono/pull/3589. Would you be willing to review it to see if it solves the problem in your use case?

haochenx commented 3 weeks ago

@usualoma cool! sure, I'm glad to take a look!

KyleAMathews commented 2 weeks ago

I ran into this — is it possible to just disable the decompression Hono is doing? I'm not sure why that's the default?