haskell / network-uri

URI manipulation facilities

Clarification: Should the strings in a `URI` type from this package be considered *already escaped* or not? #39

Closed: ryantrinkle closed this issue 6 years ago

ryantrinkle commented 6 years ago

When manipulating URIs, it's important to know whose responsibility it is to call `encodeURI`, `decodeURI`, and friends. I'm not clear on whether the responsibility for URI escaping falls on the person creating the `Network.URI.URI` or on the person rendering it. For example, does `URI "http:" Nothing "/ø" "" ""` construct an invalid instance of `URI`, or should whoever renders it be responsible for producing `"http:/%C3%B8"`?

ezrakilty commented 6 years ago

Yes, this is a tricky area. RFC 3986 is clear on this point and devotes a whole section to it ("When to Encode or Decode"): https://tools.ietf.org/html/rfc3986#section-2.4

There is also this statement in the "Reserved Characters" section: "If data for a URI component would conflict with a reserved character's purpose as a delimiter, then the conflicting data must be percent-encoded before the URI is formed." https://tools.ietf.org/html/rfc3986#section-2.2
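To make that rule concrete, here is a minimal sketch using `escapeURIString` and `isUnescapedInURIComponent` from `Network.URI`. The helper name `queryParam` and the sample key and value are made up for illustration; they are not part of the package.

```haskell
import Network.URI (escapeURIString, isUnescapedInURIComponent)

-- Hypothetical helper (not part of network-uri): build one "key=value"
-- query pair, percent-encoding both sides so that reserved characters in
-- the data cannot be mistaken for delimiters.
queryParam :: String -> String -> String
queryParam key value = encode key ++ "=" ++ encode value
  where
    encode = escapeURIString isUnescapedInURIComponent

main :: IO ()
main =
  -- The '&' and '=' inside the value are data, not delimiters, so they
  -- are escaped before the query string is assembled.
  putStrLn ("?" ++ queryParam "q" "a&b=c")  -- prints "?q=a%26b%3Dc"
```

The escaping happens before the pieces are joined, which is the order the RFC asks for: once the string is assembled, an encoded `&` and a delimiter `&` can no longer be told apart.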

The Haskell package docs allude to this, although it's a bit subtle: https://hackage.haskell.org/package/network-uri-2.6.1.0/docs/Network-URI.html#g:6

Thinking about it now, I'd say that fact should be advertised more centrally, for example in the description of the data constructor.

In your example, `URI "http:" Nothing "/ø" "" ""` would indeed produce an invalid URI.
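A small sketch of what that means in practice, assuming the caller does the escaping with `escapeURIString isUnescapedInURI` before building the value. The names `unescaped`, `escaped`, and `render` are only for illustration.

```haskell
import Network.URI (URI (..), escapeURIString, isURI, isUnescapedInURI, uriToString)

-- The value from the question: the path holds a raw, unescaped 'ø'.
unescaped :: URI
unescaped = URI "http:" Nothing "/ø" "" ""

-- The same URI, with the path escaped by the caller before construction.
escaped :: URI
escaped = unescaped { uriPath = escapeURIString isUnescapedInURI (uriPath unescaped) }

-- Render a URI to a String; 'id' leaves any userinfo untouched
-- (there is none in this example).
render :: URI -> String
render u = uriToString id u ""

main :: IO ()
main = do
  -- Rendering concatenates the fields as they are; it does not escape them.
  putStrLn (render unescaped)
  putStrLn (render escaped)  -- prints "http:/%C3%B8"
  -- Check both rendered strings against the package's own URI parser.
  print (isURI (render unescaped), isURI (render escaped))
```

The last line should print `(False,True)`: the rendered form of the unescaped value is rejected by `isURI`, while the escaped one is accepted.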