hyperium / http

Rust HTTP types
Apache License 2.0
1.12k stars 283 forks source link

Add `Clone` for Request/Response/Extensions #574

Closed LucioFranco closed 8 months ago

LucioFranco commented 1 year ago

Proposal

Allow Request<T>/Respnose<T> to implement Clone if T: Clone. The current blocker for this is that Extensions doesn't implement Clone. This is due to the complicated nature of Any and Clone. Though it is a solvable problem shown by dyn-clone and the anymap (beta releases) crates.

The solution in this PR swaps out our own AnyMap implementation to use the anymap crate 1.0.0-beta.2 version that supports clonable values. The downside to this change is that it nows requires all the types that get inserted into the anymap to implement clone which is not a backwards compatible change and will break end users. To support this use case this PR adds a NotCloneExtension (pending name, open to ideas) that allows users to implement Clone for a type T that doesn't implement Clone. This is done by storing the inner type as an Option and when cloning the newtype it will set the value to None. This allows users to for example, have different extension instances per requests in the retry middleware.

The benefit of Request<T>/Response<T> implementing clone will greatly improve the ergonomics for end users and simplify things like the tower retry layer.

Closes https://github.com/hyperium/http/issues/395

cc @seanmonstar @hawkw @davidbarsky @hlbarber @LukeMathWalker

Notes

This PR uses a forked version of the anymap crate that adds a specialized extend_map fn. The plan I think going forward would be to vendor the minimal set of code from the anymap crate and to not depend on it for 1.0 of http.

LucioFranco commented 1 year ago

Wrap the dyn Any in an Arc instead of Box, forgo the Extensions::get_mut method. We could provide a Extensions::swap instead. Perhaps it's possible to emulate get_mut by doing something akin to Arc::make_mut.

I think it makes more sense to let users add the Arc themselves rather than force it upon them? What if they want to use a Rc?

The Extensions::clone() returns Extensions::new().

This just feels misleading and doesn't support an extension existing through the clones.

LukeMathWalker commented 1 year ago

Is separating (or unbundling, if you prefer) Extensions from Request/Response an option on the design table? Or is that considered to be a no-starter?

LucioFranco commented 1 year ago

@LukeMathWalker I don't think that changes much since we still need a single type to pass through. One option, would be to change tower to accept something like request, context but that doesn't really feel like an improvement over just making extensions cloneable. So I would say removing it doesn't make much sense right now and probably not worth exploring.

LukeMathWalker commented 1 year ago

One option, would be to change tower to accept something like request, context

That's the direction I would have envisioned by unbundling.

davidbarsky commented 1 year ago

I think that making extensions cloneable is a good thing, but I think I agree with @LukeMathWalker. However, I don't think changing Tower to accept (request, context) is the right move, but a wrapper along the lines of Context<Request> could be, where Context can be provided by Tower and acts as a semi-smart pointer.

Before moving on, I wanted to add some notes on my interest in this feature:

In any case, I don't feel too strongly about the existence of extensions inside of http and I think that making extensions cloneable is probably a net win, but if I had a vote, I think exploring a world where Tower a context-style wrapper to requests and response is absolutely warranted.

LucioFranco commented 1 year ago

since I'm not sure request-level retries make sense when streaming bodies are a thing.

Connection level/before streaming body failures are generally a thing you want to retry.

I think it makes overloading downstream services way too easy and users should instead be pushed to use well-tuned retry middleware from libraries like Tower.

The plan is to implement good defaults in the batteries included retry middleware to prevent this stuff. I think no matter what we want to support retries and let downstream consumers decide if they want to enable it or not.

Extensions seems like a use-case better served by a crate aimed at more advanced users, such as Tower, but I can see how the convenience of extensions existing directly on http::Request/http::Response is useful.

I think the ergonomics of it existing with one of the most used types in the ecosystem has a huge benefit. I think generally speaking we've been pretty happy with extensions and I have not seen many users complain about it. I think this extensions style has been successfully adopted by the aws sdk as well.

I don't think doing something like Context<Request> is as ergonomic as just having a Request, especially in the space of something like tower where generics are already complicated. I think overall with our infrastructure we should aim to reduce complexity and embedding extensions imo is one way to get there.

I think another thing we need to think about here is timelines and what makes sense. I don't know if anyone has the time to explore changing up context in tower and the deadline for a http 1.0 is fast approaching. I think we may want to consider keeping things how they are and making a change like this just improves some flexibility while not losing out on the benefits of the previous implementation.

jdisanti commented 1 year ago

In the AWS SDK, the SdkBody that we use as the generic arg for http::Request is conditionally cloneable depending on its source. An SdkBody that wraps a user-provided stream isn't cloneable since the source is gone after its read, but a local file system file where we still have the path is cloneable. Given this, I don't know how much we would be able to take advantage of http::Request being Clone when B: Clone. We need some form of optionality that isn't in the type system, unless we greatly overhaul the SDK to use multiple body types, but that would make it harder to use I think.

I think this extensions style has been successfully adopted by the aws sdk as well.

We use a similar property bag pattern in the SDK, but we ended up wrapping http::Request in our own operation::Request so that we could have cloneable properties as well as add a try_clone method that does some heavy lifting of manually cloning http::Request.

hawkw commented 1 year ago

Regarding moving the extensions out to a separate type...it's worth noting that much of the value from having a notion of "extensions" comes from it being defined in the http crate, so that different crates that might pass around an http::Request/http::Response can propagate the data that other libraries want to associate with that request/response, without having to be aware of those other libraries.

For example, hyper's implementation of protocol upgrades (e.g. CONNECT) relies on setting request/response extensions: https://github.com/hyperium/hyper/blob/96dcc79b62572c591b8062f26db602868de5b5f0/src/upgrade.rs#L304

Because the extensions are part of the http::Request type, hyper can rely on the data it stores for upgrades being propagated by other libraries involved in handling a request, such as tower or web frameworks built on top of hyper. If the solution was that crates like tower-http or web frameworks using hyper should define their own types that store extensions, then hyper would have to be aware of those types...which is untenable for several reasons (e.g. hyper needing only stable dependencies, cyclic deps between web frameworks that depend on hyper and define their own context types, hyper potentially having to implement separate code paths for each dependent's context type, etc).

AFAICT, a model where other libraries provide the context/extension type wrapping a Request kind of defeats the purpose of having a typemap of extensions in the first place, since they can no longer rely on their extensions being propagated across library boundaries when http is the only dependency shared by those libraries. The entire reason we store extensions in a typemap is to erase types so that they can cross crate boundaries without requiring every crate to be aware of a specific extension type. If we decide to give up on having that functionality in http, crates that want to wrap requests could just have typed structs that store their extensions as fields, and we wouldn't need the typemap...but we would be losing the ability to propagate extensions through libraries that are unaware of any given extension.

davidbarsky commented 1 year ago

(i'm little sick, so I apologize if this response isn't particularly coherent or well-edited.)

Connection level/before streaming body failures are generally a thing you want to retry.

Fair enough!

The plan is to implement good defaults in the batteries included retry middleware to prevent this stuff. I think no matter what we want to support retries and let downstream consumers decide if they want to enable it or not.

I had a longer response, but after re-reading this, I'm probably over-indexing on this point and was probably steeped too long in the context of Thrift-based RPC, so I replaced it with this sentence.

I think the ergonomics of it existing with one of the most used types in the ecosystem has a huge benefit. I think generally speaking we've been pretty happy with extensions and I have not seen many users complain about it. I think this extensions style has been successfully adopted by the aws sdk as well.

Sure, I can see that.

I don't think doing something like Context is as ergonomic as just having a Request, especially in the space of something like tower where generics are already complicated. I think overall with our infrastructure we should aim to reduce complexity and embedding extensions imo is one way to get there.

🤷. I think I've had stronger opinions in the past but they're not so strong now. I think most of my complaints boil down to aesthetics and vibes, which isn't really the best basis for making decisions like these.

I think another thing we need to think about here is timelines and what makes sense. I don't know if anyone has the time to explore changing up context in tower and the deadline for a http 1.0 is fast approaching. I think we may want to consider keeping things how they are and making a change like this just improves some flexibility while not losing out on the benefits of the previous implementation.

I haven't really been keeping pace with Hyper's roadmap to 1.0, so I'm assuming it's trying to hit 1.0 this year, which: fair.

For example, hyper's implementation of protocol upgrades (e.g. CONNECT) relies on setting request/response extensions: https://github.com/hyperium/hyper/blob/96dcc79b62572c591b8062f26db602868de5b5f0/src/upgrade.rs#L304

I keep forgetting about Hyper's usage of extensions in upgrades; I think that usecase is sufficiently compelling that it overrides any aesthetic concerns I might have.

The entire reason we store extensions in a typemap is to erase types so that they can cross crate boundaries without requiring every crate to be aware of a specific extension type. If we decide to give up on having that functionality in http, crates that want to wrap requests could just have typed structs that store their extensions as fields, and we wouldn't need the typemap...but we would be losing the ability to propagate extensions through libraries that are unaware of any given extension.

That is a compelling motivation and I think the reason I forgot about this is that I haven't been steeped in writing http-based services (or any services, for that matter) for nearly the last two years.

In any case, I trust y'all will make the right decision and this PR is an improvement to the status quo.

LukeMathWalker commented 1 year ago

Regarding moving the extensions out to a separate type...it's worth noting that much of the value from having a notion of "extensions" comes from it being defined in the http crate, so that different crates that might pass around an http::Request/http::Response can propagate the data that other libraries want to associate with that request/response, without having to be aware of those other libraries.

To clarify: I was not suggesting to remove http::Extension - having a "standard" extension map type that everybody uses is extremely advantageous for all the reasons you enumerated. My suggestion was much more limited: keep http::Extension, but do not store it as a field in http::Request/http::Response.

It would open up a few possibilities for crates that depend on http (e.g. I've often seen users confused when they found out that something they inserted in the extension map of an http::Request doesn't show up in the extensions map for the returned http::Response) but it does come with an impact on ergonomics - you now have to pass two things around instead of one, which can be annoying.

benwis commented 1 year ago

Was wondering if there was any movement on this? Making Request Clone would solve some of our problems for use of the http Request and Response types

benwis commented 8 months ago

I am irrationally excited this might become get merged. @LucioFranco let me know if I can help

seanmonstar commented 8 months ago

This was done in #395