hyperium / http

Rust HTTP types
Apache License 2.0
1.13k stars 284 forks source link

The hyperium URI parser is stricter than the url crate #342

Open asajeffrey opened 4 years ago

asajeffrey commented 4 years ago

Over in Servo, we've seen cases where URLs that the urlcrate parsed were then rejected by hyper: https://github.com/servo/servo/issues/24175. The case we hit https://github.com/servo/servo/pull/24148#issuecomment-529963631 was the URL http://"empty_script.js/?id=5\\%22, but there are probably other instances.

bluetech commented 4 years ago

http::Uri actually parses HTTP request targets. If I'm reading the RFC correctly, the character " is not a valid character in the authority part of an absolute-form HTTP request target.

Are you sure that it is valid? I tried to check how Firefox would serialize this URL to the HTTP request, but it seems Firefox doesn't consider it to be a valid URL either so couldn't get it to work.

asajeffrey commented 4 years ago

@SimonSapin?

SimonSapin commented 4 years ago

Should we really have a full URL given to Hyper as a single string to parse? I would expect Hyper to accept separately a protocol (as an enum or as separate APIs), a impl std::net::ToSocketAddrs for the purpose of opening a TCP or UDP, socket, and a request-target that goes in the HTTP request. The latter’s most (?) typical form is a path with optional query string.

asajeffrey commented 4 years ago

@SimonSapin the particular case that was problematic here was a referer header, not opening a connection.