hyperium / http

Rust HTTP types
Apache License 2.0

Interoperability with servo URL? #396

Closed: kornelski closed this issue 4 years ago

kornelski commented 4 years ago

When writing a library that needs to abstract over URLs, I've run into the problem that the Rust ecosystem has two incompatible URL types: http::Uri and url::Url.

Unfortunately the API choices both crates have made make them incompatible:

To be honest, Servo's implementation seems better to me. It simply stores the entire URL with just one heap allocation, instead of up to three. It can use 32-bit offsets instead of a usize length + vtable + AtomicPtr for each part (Bytes). And it's very convenient to have as_str().
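A rough sketch of the single-allocation layout described above: one String plus 32-bit offsets into it. The `PackedUrl` name, its fields, and the naive parser are assumptions for illustration only, and this is not a copy of either crate's actual code.

```rust
/// Hypothetical single-allocation URL: the whole text lives in one String,
/// and each component is recovered by slicing with stored 32-bit offsets.
pub struct PackedUrl {
    serialization: String,
    /// Byte index where the scheme ends (the position of "://").
    scheme_end: u32,
    /// Byte index where the authority ends and the path begins.
    authority_end: u32,
}

impl PackedUrl {
    /// Naive parser that only understands `scheme://authority/path?query`.
    /// A real URL parser handles far more cases; this is only a sketch.
    pub fn parse(input: &str) -> Option<PackedUrl> {
        // Offsets are stored as u32, so reject anything that would not fit.
        if input.len() > u32::MAX as usize {
            return None;
        }
        let scheme_end = input.find("://")?;
        let authority_start = scheme_end + 3;
        let authority_end = input[authority_start..]
            .find('/')
            .map(|i| authority_start + i)
            .unwrap_or(input.len());
        Some(PackedUrl {
            serialization: input.to_owned(), // the only heap allocation
            scheme_end: scheme_end as u32,
            authority_end: authority_end as u32,
        })
    }

    /// The whole URL as one contiguous string.
    pub fn as_str(&self) -> &str {
        &self.serialization
    }

    pub fn scheme(&self) -> &str {
        &self.serialization[..self.scheme_end as usize]
    }

    pub fn authority(&self) -> &str {
        &self.serialization[self.scheme_end as usize + 3..self.authority_end as usize]
    }

    pub fn path_and_query(&self) -> &str {
        &self.serialization[self.authority_end as usize..]
    }
}
```

With this layout, `PackedUrl::parse("https://example.com/a/b?x=1")` yields a value whose `scheme()`, `authority()`, and `path_and_query()` are all slices of the same string, and `as_str()` costs nothing.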

I think Uri could switch to that design if it added lifetimes on Scheme<'_>/Authority<'_>/PathAndQuery<'_>, so these could work similarly to Cow<'static, str> and either own their string or borrow from their parent Uri's single allocation.
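One way to picture the lifetime idea, as a minimal sketch: a part type that wraps `Cow<'a, str>` and so either borrows from its parent's single allocation or owns its text. `SketchUri` and this `Authority<'a>` are hypothetical stand-ins, not the types in the http crate, and the sketch returns the part by value rather than by reference.

```rust
use std::borrow::Cow;
use std::ops::Range;

/// Hypothetical part type: borrows from a parent URI or owns its own text.
pub struct Authority<'a>(Cow<'a, str>);

impl<'a> Authority<'a> {
    pub fn borrowed(s: &'a str) -> Self {
        Authority(Cow::Borrowed(s))
    }

    pub fn as_str(&self) -> &str {
        &self.0
    }

    pub fn is_borrowed(&self) -> bool {
        matches!(self.0, Cow::Borrowed(_))
    }

    /// Detach from the parent by copying the text into its own allocation.
    pub fn into_owned(self) -> Authority<'static> {
        Authority(Cow::Owned(self.0.into_owned()))
    }
}

/// Hypothetical parent type with a single allocation for the whole URI.
pub struct SketchUri {
    text: String,
    authority: Range<usize>,
}

impl SketchUri {
    /// `authority` is the byte range of the authority within `text`.
    /// It is assumed to be valid; slicing panics otherwise.
    pub fn new(text: &str, authority: Range<usize>) -> Self {
        SketchUri { text: text.to_owned(), authority }
    }

    /// Hands out a part that borrows from this URI, with no allocation.
    pub fn authority(&self) -> Authority<'_> {
        Authority::borrowed(&self.text[self.authority.clone()])
    }
}
```

A borrowed `Authority<'_>` cannot outlive its `SketchUri`, while `into_owned()` gives an `Authority<'static>` that can.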

With a bit of unsafety, I think this might even be done without breaking API changes by hiding the lifetime. Since Uri exposes only shared/immutable references to these objects, they won't outlive their owning Uri anyway.

Once Uri has .as_str(), Servo's URL could switch to being just a wrapper around the Uri, with Deref and From operations for Uri at zero cost, nicely unifying the types.
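The wrapper idea could look roughly like this. Both types here are toy stand-ins defined in the sketch itself (a `Uri` that already has `as_str()`, and a `Url` newtype around it); the "must contain a scheme" check is an assumed placeholder for whatever extra validation a URL type would need.

```rust
use std::convert::TryFrom;
use std::ops::Deref;

/// Stand-in for a Uri that keeps its full text and exposes as_str() (hypothetical).
pub struct Uri {
    text: String,
}

impl Uri {
    pub fn new(text: &str) -> Uri {
        Uri { text: text.to_owned() }
    }

    pub fn as_str(&self) -> &str {
        &self.text
    }
}

/// Stand-in for a URL type that is only a thin wrapper around Uri.
pub struct Url(Uri);

/// Every Uri method is reachable on a Url through auto-deref.
impl Deref for Url {
    type Target = Uri;

    fn deref(&self) -> &Uri {
        &self.0
    }
}

/// Unwrapping moves the inner value; nothing is copied or reallocated.
impl From<Url> for Uri {
    fn from(url: Url) -> Uri {
        url.0
    }
}

/// Wrapping is fallible in this sketch: the placeholder rule is that a Url
/// must be absolute, approximated here by requiring "://".
impl TryFrom<Uri> for Url {
    type Error = &'static str;

    fn try_from(uri: Uri) -> Result<Url, Self::Error> {
        if uri.as_str().contains("://") {
            Ok(Url(uri))
        } else {
            Err("not an absolute URL")
        }
    }
}
```

Because the wrapper has a single field, converting in either direction is a move of the same allocation.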

What do you think?

seanmonstar commented 4 years ago

> Servo's implementation seems better to me. It simply stores the entire URL with just one heap allocation, instead of up to three.

The difference in implementation was carefully chosen. The reason http::Uri has 3 different pieces is that in HTTP/2 the URI arrives as 3 separate pseudo-headers (:scheme, :authority, and :path), so creating a single contiguous string would mean allocating a new String and copying all three into it (and :authority and especially :path can be quite long).
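To make the trade-off concrete, here is a small sketch that uses plain `String`s as stand-ins for the three received values. The type and function names are made up for illustration: keeping the parts just moves the existing buffers, while joining them needs a fresh allocation and a copy of every byte.

```rust
/// The three request pseudo-header values, each received in its own buffer.
pub struct PseudoHeaders {
    pub scheme: String,
    pub authority: String,
    pub path: String,
}

/// Three-part representation: takes ownership of the existing buffers.
pub struct PartsUri {
    pub scheme: String,
    pub authority: String,
    pub path_and_query: String,
}

/// Keeping the parts separate moves the buffers; no bytes are copied.
pub fn keep_parts(h: PseudoHeaders) -> PartsUri {
    PartsUri {
        scheme: h.scheme,
        authority: h.authority,
        path_and_query: h.path,
    }
}

/// A contiguous representation needs one new allocation plus a copy of all parts.
pub fn join_contiguous(h: &PseudoHeaders) -> String {
    let mut out =
        String::with_capacity(h.scheme.len() + 3 + h.authority.len() + h.path.len());
    out.push_str(&h.scheme);
    out.push_str("://");
    out.push_str(&h.authority);
    out.push_str(&h.path);
    out
}
```

After `keep_parts`, the `path_and_query` field still points at the buffer the path was received in, which is the property the three-part design preserves.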

As for the HTTP/1 case, where it is received in a single contiguous block, the usage of Bytes internally means it doesn't actually need to make 3 different allocations, but instead is just 3 ref-counted slices into the same buffer.
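The shared-buffer case can be approximated with the standard library alone. In this sketch `SharedSlice` is a hypothetical stand-in for a ref-counted slice (an `Arc<[u8]>` plus a byte range), not the real Bytes type, and `split_uri` is a naive splitter that only handles `scheme://authority/path` input.

```rust
use std::ops::Range;
use std::sync::Arc;

/// Std-only stand-in for a ref-counted view into a shared buffer.
#[derive(Clone)]
pub struct SharedSlice {
    buf: Arc<[u8]>,
    range: Range<usize>,
}

impl SharedSlice {
    /// Cloning the Arc bumps a reference count; the bytes are not copied.
    pub fn new(buf: &Arc<[u8]>, range: Range<usize>) -> Self {
        SharedSlice { buf: Arc::clone(buf), range }
    }

    pub fn as_bytes(&self) -> &[u8] {
        &self.buf[self.range.clone()]
    }
}

/// Splits one received buffer into (scheme, authority, path-and-query) views
/// that all point into that same buffer.
pub fn split_uri(buf: &Arc<[u8]>) -> Option<(SharedSlice, SharedSlice, SharedSlice)> {
    let scheme_end = buf.windows(3).position(|w| w == &b"://"[..])?;
    let authority_start = scheme_end + 3;
    let authority_end = buf[authority_start..]
        .iter()
        .position(|&b| b == b'/')
        .map(|i| authority_start + i)
        .unwrap_or(buf.len());
    Some((
        SharedSlice::new(buf, 0..scheme_end),
        SharedSlice::new(buf, authority_start..authority_end),
        SharedSlice::new(buf, authority_end..buf.len()),
    ))
}
```

Splitting a buffer this way leaves a single allocation with its reference count raised by three, one for each view.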