seanmonstar / httparse

A push parser for the HTTP 1.x protocol in Rust.
https://docs.rs/httparse
Apache License 2.0

Request header parsing fails when the horizontal tab is in the header field value #39

Closed tkrs closed 6 years ago

tkrs commented 6 years ago

According to RFC 7230 (Section 3.2), header field values may contain horizontal tabs (HTAB), but parsing fails when one is present.

seanmonstar commented 6 years ago

Yikes, you're right! Do you happen to have an example request that failed to parse because of this?

tkrs commented 6 years ago

Here is an example server running on the current master of Hyper:

hyper:master λ cargo run --example hello
    Finished dev [unoptimized + debuginfo] target(s) in 0.0 secs
     Running `target/debug/examples/hello`
Listening on http://127.0.0.1:3000

I then sent requests to it as follows:

hyper:master λ ipython
Python 3.6.5 (default, Jun 13 2018, 10:23:14)
Type 'copyright', 'credits' or 'license' for more information
IPython 6.4.0 -- An enhanced Interactive Python. Type '?' for help.

In [1]: import requests

In [2]:  requests.get('http://127.0.0.1:3000/hello', headers={ 'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2741.6823 Safari/537.36' })
Out[2]: <Response [200]>

In [3]:  requests.get('http://127.0.0.1:3000/hello', headers={ 'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_1)\tAppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2741.6823 Safari/537.36' })
Out[3]: <Response [400]>

seanmonstar commented 6 years ago

What actual user agent produces requests like that (genuinely curious)?

tkrs commented 6 years ago

Yes, a product I worked on received this User-Agent. However, the request may well have come from a bot...

seanmonstar commented 6 years ago

It turns out it was a bug in the valid header value map, which was already fixed in the SIMD PR. I've backported that tiny fix, with a test, and published it as 1.2.5!
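
For readers unfamiliar with the idea of a "valid header value map", here is a minimal sketch of a byte lookup table that follows the RFC 7230 field-value rules (HTAB, SP, VCHAR, and obs-text are allowed). This is not httparse's actual code: the names `HEADER_VALUE_MAP`, `is_value_byte`, `build_map`, and `is_valid_header_value` are hypothetical and exist only for illustration.

```rust
// Hypothetical sketch; NOT httparse's actual table or API.

/// True if `b` may appear in a header field value per RFC 7230:
/// HTAB, SP, VCHAR (0x21..=0x7E), or obs-text (0x80..=0xFF).
const fn is_value_byte(b: u8) -> bool {
    matches!(b, b'\t' | b' ' | 0x21..=0x7E | 0x80..=0xFF)
}

/// Build the 256-entry lookup table at compile time.
const fn build_map() -> [bool; 256] {
    let mut map = [false; 256];
    let mut i = 0;
    while i < 256 {
        map[i] = is_value_byte(i as u8);
        i += 1;
    }
    map
}

/// One entry per possible byte; `true` means the byte is allowed.
static HEADER_VALUE_MAP: [bool; 256] = build_map();

/// Check every byte of a candidate header value against the table.
fn is_valid_header_value(value: &[u8]) -> bool {
    value.iter().all(|&b| HEADER_VALUE_MAP[b as usize])
}

fn main() {
    // A shortened form of the User-Agent from this issue, with an embedded tab.
    let ua = b"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_1)\tAppleWebKit/537.36";
    assert!(is_valid_header_value(ua));
    // Control characters other than HTAB are still rejected.
    assert!(!is_valid_header_value(b"bad\x00value"));
}
```

A table like this costs one indexed load per byte. If the entry for 0x09 (HTAB) is wrongly marked invalid, tab-containing values are rejected, which would match the 400 response shown above.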