http-rs / surf

Fast and friendly HTTP client framework for async Rust
https://docs.rs/surf
Apache License 2.0

HTTP/1.0 support #287

Open Shnatsel opened 3 years ago

Shnatsel commented 3 years ago

On some websites, e.g. http://thomsonreuters.co.uk, surf fails with the following error:

Unsupported HTTP version

Firefox, curl and ureq (a blocking Rust client) work fine.

11820 websites out of the top million from the Feb 3 Tranco list are affected.

Tested using this code. Test tool output from all affected websites: surf-unsupported-http-version.tar.gz

I've only tested the async-h1 backend; I don't know if the other backends are affected.

Fishrock123 commented 3 years ago

async-h1 is an HTTP/1.1 parser.

Opt-in HTTP/1.0 support was merged fairly recently, but you'll need to set it up yourself: https://github.com/http-rs/async-h1/pull/170

That support is server-only; there is currently no client support.
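
To illustrate the kind of check under discussion, here is a minimal sketch of a response status-line parser that accepts both HTTP/1.0 and HTTP/1.1. It is not async-h1's actual code: the names `Version` and `parse_status_line` are hypothetical, and it assumes the affected sites answer with an `HTTP/1.0` status line, as the issue title suggests.

```rust
/// HTTP versions this sketch recognises in a response status line.
/// (Hypothetical type, not part of async-h1 or surf.)
#[derive(Debug, PartialEq, Clone, Copy)]
enum Version {
    Http10,
    Http11,
}

/// Parse a status line such as "HTTP/1.0 301 Moved Permanently".
/// Returns the protocol version and the numeric status code.
fn parse_status_line(line: &str) -> Result<(Version, u16), String> {
    // Drop the trailing CRLF, then split into version, code, and reason phrase.
    let mut parts = line.trim_end().splitn(3, ' ');

    // A strict HTTP/1.1-only parser would reject "HTTP/1.0" here.
    let version = match parts.next() {
        Some("HTTP/1.1") => Version::Http11,
        Some("HTTP/1.0") => Version::Http10,
        _ => return Err("Unsupported HTTP version".to_string()),
    };

    let code = parts
        .next()
        .ok_or_else(|| "Missing status code".to_string())?
        .parse::<u16>()
        .map_err(|e| format!("Invalid status code: {e}"))?;

    Ok((version, code))
}

fn main() {
    // Accepted by this sketch: an HTTP/1.0 redirect.
    println!("{:?}", parse_status_line("HTTP/1.0 301 Moved Permanently"));
    // Still rejected: anything that is not HTTP/1.0 or HTTP/1.1.
    println!("{:?}", parse_status_line("HTTP/2 200"));
}
```

A real client would also have to handle the behavioural differences of HTTP/1.0 (for example, connections that are not persistent by default), which this sketch ignores.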

Shnatsel commented 3 years ago

It would be nice to expose this option in surf and accept HTTP 1.0 by default, since literally every other client I've tested seems to do so.

Fishrock123 commented 3 years ago

(Note, this is barely over 1% of websites.)

It's of low importance to newer HTTP clients, but we may add an option to support it, depending on what implications it has.

d4h0 commented 2 years ago

> I've only tested the async-h1 backend; I don't know if the other backends are affected.

At least with the hyper backend, this problem doesn't seem to exist:

$ longboard -c h1 GET http://thomsonreuters.co.uk
Error: Unsupported HTTP version

$ longboard -c hyper GET http://thomsonreuters.co.uk
─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
response headers
─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
{
    "content-type": "application/octet-stream",
    "server": "BigIP",
    "connection": "Keep-Alive",
    "location": "http://www.thomsonreuters.co.uk/",
    "content-length": "0",
}
─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
status
─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
301: Moved Permanently
─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
response body (application/octet-stream)   <EMPTY>

> It's of low importance to newer HTTP clients, but we may have an option to support it depending what implications it has.

To me, it seems like a big problem if 1% of all websites won't work with surf/h1. Basically, if I use async-std and surf, there is a 1% chance that I'll have to rewrite my code and use a different library. (I use tokio and surf/hyper, so it's not a problem for me personally.)