seanmonstar / httparse

A push parser for the HTTP 1.x protocol in Rust.
https://docs.rs/httparse
Apache License 2.0

Allowed characters in path #47

Closed: theikkila closed 5 years ago

theikkila commented 5 years ago

We're using the actix-web library in our project, and it uses this library for HTTP parsing. We need to accept requests whose query parameters contain characters that RFC 2396 does not allow unescaped, such as the caret (^). URLs are a human interface, and humans shouldn't be expected to URL-encode by hand in these situations. We checked the parser implementations in Nginx (https://github.com/nginx/nginx/blob/master/src/http/ngx_http_parse.c#L13) and Node.js (https://github.com/nodejs/http-parser/blob/master/http_parser.c#L187), and both seem to be more liberal, allowing more characters than the httparse implementation.

Should we look for a workaround, or would a similarly permissive implementation be in scope for httparse?
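
To make the request concrete, here is a minimal sketch of the kind of permissive check being asked for. It is not httparse's actual lookup table or API: the names `is_permissive_uri_byte` and `request_target` are hypothetical, and the rule "accept every visible ASCII byte" is an assumption chosen for illustration, loosely in the spirit of the more liberal parsers mentioned above.

```rust
/// Hypothetical permissive check for one byte of a request target.
/// Assumption for illustration: accept every visible ASCII character
/// (0x21..=0x7E), which includes '^'. Space, control characters, DEL,
/// and non-ASCII bytes are rejected.
fn is_permissive_uri_byte(b: u8) -> bool {
    (0x21..=0x7E).contains(&b)
}

/// Extracts the request target from a request line such as
/// "GET /search?q=a^b HTTP/1.1".
/// Returns None if the line does not have exactly three space-separated
/// parts, if the target is empty, or if the target contains a byte that
/// the check above rejects. The method and version are not validated.
fn request_target(line: &str) -> Option<&str> {
    let mut parts = line.split(' ');
    let _method = parts.next()?;
    let target = parts.next()?;
    let _version = parts.next()?;
    if parts.next().is_some() || target.is_empty() {
        return None;
    }
    if target.bytes().all(is_permissive_uri_byte) {
        Some(target)
    } else {
        None
    }
}
```

With this rule, a request line like `GET /search?q=a^b HTTP/1.1` yields the target `/search?q=a^b`, while a target containing a tab or a raw space is still rejected.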

seanmonstar commented 5 years ago

It'd help to see some examples of URLs that should be passing (so the unit tests can include them).

theikkila commented 5 years ago

I have included a fix and minimal tests:

#49

There is also a larger test suite for various URLs in the following PR:

#50

The test suite was generated from the WHATWG specification tests: https://url.spec.whatwg.org/#query-state

seanmonstar commented 5 years ago

With #49 merged, this is fixed!