hyperium / hyper

An HTTP library for Rust
https://hyper.rs
MIT License
14.57k stars 1.6k forks source link

hyper (via reqwest) reports "invalid HTTP header parsed" #2083

Open spease opened 4 years ago

spease commented 4 years ago

I'm trying to work with an API on an internal server, and hyper keeps blowing away the connection due to "invalid HTTP header parsed". There doesn't appear to be a specific error provided, however, which has so far made it impossible to troubleshoot. I've enabled RUST_LOG=trace and even tried switching off certificate and domain name checks and gzip, is there any way to get it to work or at least figure out what hyper thinks is wrong with the header?

Both curl and Postman appear to work fine, and testssl doesn't reveal any glaring issues.

Thanks.

davidbarsky commented 4 years ago

It might be related to casing or non-ASCII characters in the returned headers. To debug this, I'd suggest trying to place all the headers you get from curl or Postman into an http::header::HeaderMap and seeing which one fails.

If that doesn't work, the rr debugger on Linux could be handy: https://blog.faraday.io/reversing-rust-debugging-a-terrible-error-message-backwards-in-time-with-rr-2/. I believe Windows has something similar, but I can't vouch for it.

spease commented 4 years ago

rr reports it’s unable to open a performance counter under Linux.

This is now a major blocker for a work project. Is there any way to get a better error message about what’s actually failing? Every other http utility I’ve tried doesn’t have any issue with the same endpoint.

seanmonstar commented 4 years ago

Are you able to include the raw bytes that the endpoint normally sends back? Or does it contain private info?

spease commented 4 years ago

Unfortunately it’s an internal authentication endpoint so yeah I don’t think I can send the raw bytes. Right now I’m looking at enhancing the error message to see what’s causing it to fail.

EDIT: Also the failure occurs after I call reqwest’s “send()” so I don’t even get a response object back from reqwest to retrieve the bytes from.

spease commented 4 years ago

Found it ultimately by instrumenting httparse with trace statements. Apparently other people have encountered this issue before:

https://forums.xamarin.com/discussion/60036/soh-char-in-http-response-causes-serverprotocolviolation

I can try to escalate this internally to get the product fixed, but I may need to continue without changing the server. @seanmonstar is there a way I can update httparse to have a way to allow these requests that you’ll accept it upstream?

spease commented 4 years ago

Small update. It looks like the server also does line folding. This appears to be explicitly against the HTTP/1.1 spec, so I'm looking into addressing it server-side.

I think I would morph this issue then into both highlighting a need for better error messages (at the httparse level) and the ability to ignore invalid headers (which appears to be a separate issue in one of the hyper-related projects already) to handle the case of troubleshooting & working around noncompliant servers.

dekellum commented 4 years ago

It would seem reasonable for hyper to maybe log a warning and just omit headers that fail to parse. The only thing that gives me pause with that is if there is any possible security issue with receiving partial set of headers at an application level? I guess if that were the case, then this could still be offered as a non-default configuration item with a warning in the docs?

seanmonstar commented 4 years ago

Regarding debugging, hyper specifically doesn't log the underlying bytes, to prevent any accidental log settings from causing possibly confidential bytes being stored to files. It'd be possible to add logging externally by providing some AsyncRead + AsyncWrite that logs.

Prompted by this issue, I've added an easy option to reqwest to enable this: https://github.com/seanmonstar/reqwest/pull/774

cliffcrosland commented 4 years ago

@seanmonstar - The following HTTP response headers, with broken newlines (just \n instead of \r\n), reproduces the panic:

HTTP/1.0 503\nContent-type: text/html\nContent-length: 41878\nConnection: Keep-Alive\n
thread 'tokio-runtime-worker-0' panicked at 'illegal header name from httparse: b"503\nContent-type"'
sre commented 4 years ago

Hi,

I have hit this error (via reqwest) with a server (Zyxel GS1900-10HP switch) response not containing an empty line after the headers:

HTTP/1.1 200 OK
Date: Tue, 11 Feb 2020 19:55:48 GMT
Server: Hydra/0.1.8
Accept-Ranges: bytes
Connection: close
<html>

[content]</html>

While obviously not following the standard, python's urllib and browser's seem to handle this case. Would be nice to get this working with hyper.

cliffcrosland commented 3 years ago

Especially now that hyper is being used in libraries like Curl for memory safety and stability, it seems that returning errors instead of panicking on invalid HTTP header input is very important.

If hyper panics on invalid HTTP input, using hyper in Curl would make your C/C++ program more prone to crash than using Curl without hyper.

On Sat, May 1, 2021 at 7:24 AM Sergey F. @.***> wrote:

I'm having a panic in hyper's code...

thread 'tokio-runtime-worker' panicked at 'illegal header name from httparse: b"302\nStatus"', /home/dev/.cargo/registry/src/github.com-1ecc6299db9ec823/hyper-0.14.7/src/proto/h1/role.rs:951:28 stack backtrace: 0: rust_begin_unwind at /rustc/2fd73fabe469357a12c2c974c140f67e7cdd76d0/library/std/src/panicking.rs:493:5 1: std::panicking::begin_panic_fmt at /rustc/2fd73fabe469357a12c2c974c140f67e7cdd76d0/library/std/src/panicking.rs:435:5 2: ::parse at /home/dev/.cargo/registry/src/github.com-1ecc6299db9ec823/hyper-0.14.7/src/proto/h1/role.rs:951:28 note: Some details are omitted, run with RUST_BACKTRACE=full for a verbose backtrace

Why cannot we simply return an error instead of panicing and making whole thread crash?... I'm using hyper on potentially unsafe input(web crawler) and I cannot just have arbitrary crashes when some data is off

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/hyperium/hyper/issues/2083#issuecomment-830641108, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAEN7OWSTZJV5YORMURNCWDTLQFLNANCNFSM4J4FSNLQ .

-- Sent from Gmail Mobile

grnmeira commented 1 year ago

I've been having the same issue (but using hudsucker/hyper directly). It'd be really nice if hyper could relax those parsing rules a bit.

And it looks like httparser introduced a PR in May/2022 that actually allows some level of configuration on how to deal with those invalid headers.

https://github.com/seanmonstar/httparse/commit/0e7d95ccda646aa2ce338603f9f97e933079f6c2

It basically introduced a configuration that allows some of these header problems to be ignored. Does hyper expose those configurations somehow? Or can that be exposed?

grnmeira commented 1 year ago

I've been having the same issue (but using hudsucker/hyper directly). It'd be really nice if hyper could relax those parsing rules a bit.

And it looks like httparser introduced a PR in May/2022 that actually allows some level of configuration on how to deal with those invalid headers.

seanmonstar/httparse@0e7d95c

It basically introduced a configuration that allows some of these header problems to be ignored. Does hyper expose those configurations somehow? Or can that be exposed?

Ok, clearly existent issue https://github.com/hyperium/hyper/issues/2879