hyperium / http

Rust HTTP types
Apache License 2.0

Allow percent encoding in URI host #528

Open paullgdc opened 2 years ago

paullgdc commented 2 years ago

According to RFC 3986 section 3.2.2 (https://datatracker.ietf.org/doc/html/rfc3986#section-3.2.2), the host component of a URL's authority is allowed to carry percent-encoded characters.

The parsing code only allowed % in IP addresses and userinfo, which is needlessly restrictive. It also accepted some invalid URLs, because it did not check that the percent character is followed by two hex characters.

This PR allows percent encoding everywhere in the host, and checks if the percent-encoding is valid.
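As an illustration of the validity check described above, here is a minimal sketch. It is not the code in this PR's diff; `has_valid_percent_encoding` is a hypothetical helper name, and it only checks that every `%` is followed by two hex digits, not that the other characters are legal in a host.

```rust
/// Hypothetical helper, not part of the `http` crate.
/// Returns true if every '%' in `host` is followed by two hex digits.
fn has_valid_percent_encoding(host: &str) -> bool {
    let bytes = host.as_bytes();
    let mut i = 0;
    while i < bytes.len() {
        if bytes[i] == b'%' {
            // A percent-encoded octet is exactly "%" HEXDIG HEXDIG.
            match (bytes.get(i + 1), bytes.get(i + 2)) {
                (Some(hi), Some(lo))
                    if hi.is_ascii_hexdigit() && lo.is_ascii_hexdigit() =>
                {
                    i += 3;
                }
                // Truncated ("%", "%4") or non-hex ("%zz") sequences are rejected.
                _ => return false,
            }
        } else {
            i += 1;
        }
    }
    true
}
```

With this sketch, `has_valid_percent_encoding("%2Ftmp%2Fsock")` is accepted, while `"bad%2"` and `"bad%zz"` are rejected.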

This is useful, for instance, to implement Unix domain socket URIs, which would look like unix://<percent encoded socket path>/<http request path>
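To make the Unix domain socket case concrete, the following sketch builds such a URI string by percent-encoding the socket path. Both function names are hypothetical, the socket path in the usage note is an example rather than anything from this PR, and the encoder conservatively escapes everything except RFC 3986 unreserved characters.

```rust
/// Hypothetical helper: percent-encode every byte that is not an
/// RFC 3986 "unreserved" character (ALPHA / DIGIT / "-" / "." / "_" / "~").
fn percent_encode_host(raw: &str) -> String {
    let mut out = String::with_capacity(raw.len());
    for &b in raw.as_bytes() {
        match b {
            b'A'..=b'Z' | b'a'..=b'z' | b'0'..=b'9' | b'-' | b'.' | b'_' | b'~' => {
                out.push(b as char)
            }
            // Everything else, including '/', becomes "%XX" (uppercase hex).
            _ => out.push_str(&format!("%{:02X}", b)),
        }
    }
    out
}

/// Hypothetical helper: build "unix://<encoded socket path><request path>".
/// `request_path` is assumed to start with '/'.
fn unix_socket_uri(socket_path: &str, request_path: &str) -> String {
    format!("unix://{}{}", percent_encode_host(socket_path), request_path)
}
```

For example, `unix_socket_uri("/var/run/docker.sock", "/v1/info")` yields `unix://%2Fvar%2Frun%2Fdocker.sock/v1/info`, whose host part only parses if percent encoding is allowed in the host.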

robjtede commented 2 years ago

Torn on whether this is a good thing for the crate; here are some facts, though:

Given these, and without requiring the up-to-date public suffix list referenced by the WHATWG standard, it feels like allowing percent encoding as this PR does would be largely standards-compliant.

Edit: having some doubts after reading more (research, curl PR, curl mailing list). It seems we'd need to actually do the percent decoding (sometimes?) to prevent this feature from turning into a security risk.
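For reference, a decoding step could look roughly like the sketch below. This is an assumption about what "doing the percent decoding" might involve, not something from the PR or from curl; `percent_decode` and `hex_val` are hypothetical names. It shows why comparing or filtering on the raw host can be misleading: an encoded host and its decoded form name the same bytes.

```rust
/// Hypothetical helper: value of a single hex digit, or None if not hex.
fn hex_val(b: u8) -> Option<u8> {
    match b {
        b'0'..=b'9' => Some(b - b'0'),
        b'a'..=b'f' => Some(b - b'a' + 10),
        b'A'..=b'F' => Some(b - b'A' + 10),
        _ => None,
    }
}

/// Hypothetical helper: decode "%XX" sequences into raw bytes.
/// Returns None when a '%' is not followed by two hex digits.
fn percent_decode(input: &str) -> Option<Vec<u8>> {
    let bytes = input.as_bytes();
    let mut out = Vec::with_capacity(bytes.len());
    let mut i = 0;
    while i < bytes.len() {
        if bytes[i] == b'%' {
            let hi = hex_val(*bytes.get(i + 1)?)?;
            let lo = hex_val(*bytes.get(i + 2)?)?;
            out.push((hi << 4) | lo);
            i += 3;
        } else {
            out.push(bytes[i]);
            i += 1;
        }
    }
    Some(out)
}
```

Here `percent_decode("exa%6Dple.com")` produces the bytes of `example.com`, so any allow-list or comparison done on the undecoded string would treat the two differently.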