lo48576 / iri-string

String types for URIs/IRIs.
Apache License 2.0

Support popular schemes. #34

Open damooo opened 1 year ago

damooo commented 1 year ago

It will be useful if popular schemes and their intrinsics are supported out of the box.

Most notably, any constants related to them and their respective scheme-based URI normalizations.

HTTP, for example, has a precisely defined normal form; see Section 4.2.3, "http(s) Normalization and Comparison", of RFC 9110.

To implement them properly, one currently has to recreate the entire infrastructure. It may be easier to support them out of the box instead.
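As a rough illustration of what such scheme-based normalization involves, here is a minimal sketch that works on plain strings. It covers only part of the http(s) rules: lowercasing the scheme and host, dropping the default port (80 for `http`, 443 for `https`), and treating an empty path as `/`. Percent-encoding normalization is deliberately left out. The function name `normalize_http_uri` is hypothetical and is not part of iri-string.

```rust
/// Normalizes an absolute `http`/`https` URI given as a plain string.
/// Hypothetical helper for illustration; NOT an iri-string API.
/// Returns `None` if the input does not look like an http(s) URI.
pub fn normalize_http_uri(input: &str) -> Option<String> {
    // Split off the scheme; http(s) URIs always carry an authority ("//").
    let (scheme, rest) = input.split_once("://")?;
    let scheme = scheme.to_ascii_lowercase();
    let default_port = match scheme.as_str() {
        "http" => "80",
        "https" => "443",
        _ => return None,
    };

    // The authority ends at the first '/', '?' or '#'.
    let end = rest
        .find(|c: char| matches!(c, '/' | '?' | '#'))
        .unwrap_or(rest.len());
    let (authority, tail) = rest.split_at(end);

    // Userinfo (if any) is kept untouched; only host and port are normalized.
    let (userinfo, host_port) = match authority.rsplit_once('@') {
        Some((user, hp)) => (Some(user), hp),
        None => (None, authority),
    };

    let (host, port) = match host_port.rsplit_once(':') {
        // A ':' inside an IPv6 literal such as "[::1]" is not a port separator.
        Some((h, p)) if !p.contains(']') => (h, Some(p)),
        _ => (host_port, None),
    };
    if host.is_empty() {
        return None;
    }

    let mut out = format!("{scheme}://");
    if let Some(user) = userinfo {
        out.push_str(user);
        out.push('@');
    }
    out.push_str(&host.to_ascii_lowercase());
    // Drop an empty port or the scheme's default port.
    if let Some(p) = port.filter(|p| !p.is_empty() && *p != default_port) {
        out.push(':');
        out.push_str(p);
    }
    // An empty path is equivalent to "/".
    if !tail.starts_with('/') {
        out.push('/');
    }
    out.push_str(tail);
    Some(out)
}
```

With this sketch, `HTTP://EXAMPLE.com:80` and `http://example.com/` normalize to the same string, so they can be compared with plain string equality afterwards.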

lo48576 commented 1 year ago

The reason this is currently unimplemented (although I have thought about it before) is threefold: (1.) normalization algorithms can be implemented on plain strings (with the result converted to the dedicated types once the editing is done), (2.) apps and libs with a specific purpose may provide their own types (and extra methods) for URIs rather than using generic, any-protocol-accepting string types, and (3.) when apps need URI normalization, there will likely be extra requirements that I can't foresee for now.

About (1.), iri-string already provides URI decomposition, which should already be useful for implementing normalization in downstream crates.

About (2.), I think the core value of the iri-string crate is providing types in a string-compatible manner, so I'm not willing to support non-generic use cases too much (at least for now). And, for example, apps interacting over HTTP/S and websockets will use the url crate. Protocol-specific apps and features will have their own URI handling requirements, so they will implement their own types. Usually one wouldn't use IriString for an e-mail address; one would define a dedicated EmailAddress type.
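To make the point about purpose-specific types concrete, here is a minimal sketch of such a newtype. The `EmailAddress` type, its methods, and its validation rule (a non-empty local part and domain separated by `@`) are assumptions for illustration only; this is far from full address validation and is not part of iri-string.

```rust
/// A purpose-specific string type (hypothetical example).
/// The check below is intentionally simplistic.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
pub struct EmailAddress(String);

impl EmailAddress {
    /// Accepts strings of the form `local@domain` with both parts non-empty.
    pub fn new(s: &str) -> Option<Self> {
        let (local, domain) = s.rsplit_once('@')?;
        if local.is_empty() || domain.is_empty() {
            return None;
        }
        Some(Self(s.to_owned()))
    }

    /// Domain part; an extra method a generic string type would not offer.
    pub fn domain(&self) -> &str {
        // `new` guarantees that an '@' is present.
        self.0.rsplit_once('@').map_or("", |(_, domain)| domain)
    }

    /// Borrows the address as a plain string slice.
    pub fn as_str(&self) -> &str {
        &self.0
    }
}
```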

About (3.), I simply don't know much about real-world URI normalization examples. HTTP/S is OK, but I know almost nothing about websocket, and... what others should be supported? ipfs, urn, uuid, tag, tel, ... I don't know what kinds of modification they (and other protocols I don't know yet) need, so I would have to start by investigating real-world examples. And if iri-string supported only a few of them (HTTP/S and its variants), then it would be better for that support to be provided separately by HTTP-specific crates.

However, providing some kind of generic and customizable comparison and normalization would be useful, so I'll think about what I can do. For example, making it possible to apply only case normalization or only percent-encoding normalization would be relatively easy.
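A minimal sketch of what such selectable normalization could look like on plain strings follows. The `NormalizeOptions` struct and the `normalize` function are hypothetical names, not an existing or planned iri-string API. Case normalization here covers only the scheme and the hex digits of percent-encoded triplets; host lowercasing is omitted to keep the sketch short.

```rust
/// Which normalization steps to apply (hypothetical; NOT an iri-string API).
#[derive(Clone, Copy, Debug, Default)]
pub struct NormalizeOptions {
    /// Lowercase the scheme and uppercase hex digits in percent-encoded triplets.
    pub case: bool,
    /// Decode percent-encoded triplets that represent unreserved characters.
    pub percent_encoding: bool,
}

/// Returns true for the "unreserved" characters of RFC 3986.
fn is_unreserved(b: u8) -> bool {
    b.is_ascii_alphanumeric() || matches!(b, b'-' | b'.' | b'_' | b'~')
}

/// Decodes one ASCII hex digit.
fn hex_val(b: u8) -> Option<u8> {
    (b as char).to_digit(16).map(|d| d as u8)
}

/// Length of the leading scheme, or 0 if there is no valid scheme.
fn scheme_len(s: &str) -> usize {
    match s.find(':') {
        Some(i)
            if i > 0
                && s.as_bytes()[0].is_ascii_alphabetic()
                && s[..i]
                    .bytes()
                    .all(|b| b.is_ascii_alphanumeric() || matches!(b, b'+' | b'-' | b'.')) =>
        {
            i
        }
        _ => 0,
    }
}

/// Applies only the normalization steps enabled in `opts`.
pub fn normalize(input: &str, opts: NormalizeOptions) -> String {
    let bytes = input.as_bytes();
    let scheme_end = scheme_len(input);
    let mut out = Vec::with_capacity(bytes.len());
    let mut i = 0;
    while i < bytes.len() {
        if bytes[i] == b'%' && i + 2 < bytes.len() {
            if let (Some(hi), Some(lo)) = (hex_val(bytes[i + 1]), hex_val(bytes[i + 2])) {
                let decoded = hi * 16 + lo;
                if opts.percent_encoding && is_unreserved(decoded) {
                    // Percent-encoding normalization: emit the character itself.
                    out.push(decoded);
                } else {
                    // Keep the triplet, optionally uppercasing its hex digits.
                    out.push(b'%');
                    for &d in &bytes[i + 1..=i + 2] {
                        out.push(if opts.case { d.to_ascii_uppercase() } else { d });
                    }
                }
                i += 3;
                continue;
            }
        }
        // Malformed triplets and all other bytes are copied as they are,
        // except that the scheme is lowercased when case normalization is on.
        let b = bytes[i];
        out.push(if opts.case && i < scheme_end { b.to_ascii_lowercase() } else { b });
        i += 1;
    }
    String::from_utf8(out).expect("only ASCII bytes are rewritten")
}
```

Because each step sits behind its own flag, a caller can request case normalization alone, percent-encoding normalization alone, or both, and the default options leave the input unchanged.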

lo48576 commented 1 year ago

Additional note: if RFC 9110 (HTTP Semantics) and the WHATWG URL Standard differ on URI handling, which one should be followed? (I haven't checked whether they really differ; it's just a thought.)