hyperium / http

Rust HTTP types
Apache License 2.0
1.15k stars 285 forks source link

Allow getting just Parts from a Builder #372

Closed jameysharp closed 1 year ago

jameysharp commented 4 years ago

Over in non-binary/http-cache-semantics#2, we're using request::Parts and response::Parts rather than Request or Response because we never care about the bodies. But the builders are still the easiest way to construct those.

So it'd be nice to be able to take a builder and say .parts().unwrap() instead of .body(()).unwrap().into_parts().0. Our callers will likely have full requests and responses so they won't care, this is just nice for situations like our unit tests where we really only need to build the non-body parts.

I think this means adding new methods to the builders along the lines of pub fn parts(self) -> Result<Parts<T>> { self.inner }.

Thoughts?

By the way, thanks for changing the builders to by-value instead of by-reference; that turned out to simplify our code by a lot.

seanmonstar commented 4 years ago

Is there a particular reason to use the Parts in the public API? I'd think that if someone had some Request<SomeBody>, and wanted to check the cache, they'd need to deconstruct it, instead of just passing &req.

jameysharp commented 4 years ago

I was thinking about opening a separate issue for that. Seems to me that Request and Response should have parts and parts_mut methods to match the existing body and body_mut methods?

Or just make both fields of those structs public, at which point basically none of the methods on those types are strictly necessary. I'm not suggesting removing anything, but the existing methods become purely a matter of convenience rather than a necessity.

I'd just prefer not to expose an API that forces the caller to provide information that I don't want and will never use.

seanmonstar commented 4 years ago

The original intention of the Parts structs was to allow for taking ownership of the individual pieces without exposing the exact internal structure. It wasn't meant to be a type used in APIs...

Is the issue that you need to specify a generic for the Request and Response and don't care what it is?

jameysharp commented 4 years ago

Hmm. I guess if all my API entry-points have unconstrained type-parameters for any Request and Response arguments then that's a type-level proof that any type of body is fine, and I guess that's good enough from an API design perspective.

Is the Rust compiler smart enough that if I'm passing in a &Request<T> for an unconstrained T then it doesn't need to monomorphize a separate copy of the code for every T? Given that struct layout is unspecified, I'm not sure how it could be; that optimization would only be safe if you're sure that the non-T fields are always at the same offset regardless of what T is. I'd be sad if having three different type parameters on different arguments led to an exponential explosion of copies of what'd turn out to be identical code in practice. By contrast, passing &Parts obviously doesn't involve monomorphization.

I don't understand what "allow for taking ownership of the individual pieces without exposing the exact internal structure" means. You can transfer ownership of the body separately from the everything-but-the-body, sure, and the internals of Parts remain hidden. What makes that particular boundary useful to you, and why do you expect it to not be useful to anybody else?

seanmonstar commented 4 years ago

Is the Rust compiler smart enough that if I'm passing in a &Request<T> for an unconstrained T then it doesn't need to monomorphize a separate copy of the code for every T?

Hm, probably not. What if you made a pair of traits for the methods you need, and then accepted &dyn Request type arguments? It'd work for any http::Request, and the dynamic dispatch shouldn't be an issue at all.

I don't understand what "allow for taking ownership of the individual pieces without exposing the exact internal structure" means.

I mean that without exposing the public fields of Request and Response, it allows to "deconstruct" the pieces. If there was just fn into_headers(self) -> Headers, into_uri, etc, then you couldn't take all the pieces, only 1.

lf- commented 1 year ago

I am running into a similar issue.

For context, I am writing an HTTP interception system that operates in streaming mode. I want to split the streaming HTTP parsing bit from the "doing anything with the request" piece: the parser emits a stream of events: ReqHeaders(id, parts), ReqBodyChunk(id, data), RespHeaders(id, parts), RespBodyChunk(id, data). Because of how this is structured, there is never a final request value and repeating the headers makes little sense. So the thing that makes sense to me is to send the headers and metadata downstream separately from the request data.

The reason it's like this instead of mutating the request in-place or something is that I would have to have a mega-struct that does everything if I didn't want to have big ownership woes, and this allows for a more reasonable layering.

seanmonstar commented 1 year ago

With the current method taking self, it can construct a Parts if Request were to have a different representation privately.

This likely isn't going to happen.