Closed LucioFranco closed 1 year ago
Some thoughts from an interested user:
Given new requirement on extensions, a breaking change then?
You would apply the same to Response
?
It sounds like you would impl Clone where B: Clone
so its kind of optional for bodies, but over time there might be more code expecting this. An aspect that gives me pause is that hyper's default Body
implements Stream
(and is not Clone
yet) and http_body::Body
defines a Stream
-like extension interface. How should implementations handle cloning a Stream
-type after it has been polled? Do cloned Stream
s start from the beginning (if that's even possible) or from the point they are cloned?
With synchronous corollaries, we would typically not clone an Iterator
, but clone the thing that can provide a new iterator, e.g. Vec
or any impl IntoIterator
. In other words, adding a Clone
requirement suggests that maybe, in a breaking release, we should also change to requiring a Body
implement IntoStream
? (Note unfortunately there is already something called IntoStream
, for an entirely different purpose, in futures 0.3.)
With clone in play, the HeaderMap
s in Request
and Response
might benefit from Arc
or Cow
?
Would the next breaking release also be time to consider integrating http_body
into this crate? Or is that considered out of scope? (As an observer I've hoped the second crate was a temporary solution.)
So I think the real goal here should be to make Parts
cloneable and then since bodies are generic let bodies implement clone if they can. For streams, it should be possible to detect if the request failed before the stream was extracted to retry it then. The big issue for me is making Request a bit easier to deal with and extensions seem to always be the big blocker here.
I've shared my opinion on extensions
previously and it still hasn't changed. Short version: I think should be removed. Long version:
http::Request
:
Parts
, which carry the protocol version, method, URI, and headers. These are trivially cloneable relative to the next two. I wouldn't even try to place that data in an Arc, as it might be cheaper to perform a memcpy
on smaller bits of data than bumping a refcount on an Arc
.Extensions
, which is a typemap that might not contain cloneable data. @LucioFranco proposes that all types that are stored in a typemap should implement Clone
, but I'm not sure that's the right approach. Instead, I'd like to remove extensions wholesale and replace their use-cases with task and thread locals for handling implicit context. I believe that thread/task locals (especially thread/task locals that propagate to child tasks using structured concurrency) solve the problem that http
's extensions try to tackle in a more elegant and general-purpose way. I'm aware of some specialized use-cases where task/thread locals wouldn't be a good fit, but I don't believe costs should be borne by all users of http
. Critically, I think cloning an http::Request
is a far more common use-case than placing data into extensions.Body
: for streaming bodies, it's not at clear how bodies should be cloned in a general-purpose way. We shouldn't even try to handle that here!@davidbarsky the issue is that task/thread-locals don't work with most channel backed offtask clients like hyper and tower-buffer. While ext do work with this setup. They couple data that is specific to the user within the request.
or streaming bodies, it's not at clear how bodies should be cloned in a general-purpose way. We shouldn't even try to handle that here!
Agreed, but there are special cases like gRPC where we have total control over this and could probably pull of some sneak stuff related to this. Having parts be clonable would be a huge help here.
I wouldn't even try to place that data in an Arc, as it might be cheaper to perform a memcpy on smaller bits of data than bumping a refcount on an Arc.
I mean its your choice how to implement clone, if memcpy is cheap then do that if arc bump is cheap then do that?
The issue is that task/thread-locals don't work with most channel backed offtask clients like hyper and tower-buffer. While ext do work with this setup. They couple data that is specific to the user within the request.
Sure. In those cases, I think the appropriate course of action is to implement an extension trait that performs a tracing-subscriber
-style lookup for the request context. That being said, I believe that most uses of extensions
are used to change control flow prior to the request reaching hyper::Client
or tower-buffer
.
Agreed, but there are special cases like gRPC where we have total control over this and could probably pull of some sneak stuff related to this. Having parts be clonable would be a huge help here.
My argument boils down that we don't need to settle for just making Parts
cloneable; we can make the entire http::Request
cloneable by moving extensions
out of http::Request
.
I'm not convinced that task locals are an acceptable substitute for extensions
. Linkerd makes makes use of extensions in a lot of places, and although I'll have to take a closer look, I think there are a lot of use-cases that can't be replaced with task-local storage.
FWIW, when Linkerd's retry code "clones" requests, we just clone the parts + body, and create new extensions. A lot of the data we store in extensions is specific to an individual instance of a request — e.g. we want to record latencies for retries separately.
@hawkw that seems like it could easily be done in a clone where you basically clear the data :+1:
@hawkw that seems like it could easily be done in a clone where you basically clear the data đź‘Ť
Agreed, maybe roughly implement like it does in http-types
.
A few options here:
Clone
bound when adding items to Extensions
today (but don't make the request cloneable). This would allow us to take advantage of this fact in the future without having to decide about request clonability. We could always remove the clone bound in the future without causing breaking changes.
pub fn insert<T: Send + Sync + Clone + 'static>(&mut self, val: T) -> Option<T> {
E
or Ext
to request: struct Request<B, E = ()>
— Then you can do something like pub type ExtendableRequest<B> = Request<B, Extensions>
—if you don't use Extensions
at all, your request can be clone if the body is clone (or at least Parts
can be!). If you want to use extensions you can—you can even bring-your-own CloneableExtensions
to have your own extension behavior.3 would be my personal recommendation–very flexible going forward and allows users who have no desire to have request extensions at all (most customers?) to have a cloneable request without worrying about it. We can also do a minimal 3
where we make the default Extensions
so the code as is remains identical but it allows future evolution and also enables folks to get a cloneable request by putting a Clone
type in there themselves.
- Add the
Clone
bound when adding items toExtensions
today (but don't make the request cloneable).
I think we can implement Clone
without making a breaking change at all by keeping the existing insert
method as-is (whose value would be discarded when cloning the map) and (someday) add another version of insert
that takes an impl Clone
value (example implementation on Rust Playground).
I've been planning to write up some thoughts, summarizing this thread next week, so we can arrive at a decision. One thing I think is missing is analyzing what extensions tend to be used for. That would help with deciding if they can be clone, if they can't what should they be instead, etc.
For example, in hyper the OnUpgrade
extension is not cloneable at the moment. I suppose it could be if we made it an Arc Mutex inside, or something similar (it's currently a oneshot Receiver). Whereas ReasonPhrase
is just a string, and could be easily cloned.
I imagine people likely stick loaded context into them, such as a User that was loaded from the DB, or even a DB handle. Are those clone?
There's likely good examples on linkerd2, and in the AWS SDK, and others. Such an analysis would help in deciding "should we".
I'm sure there's others, but do @hawkw @rcoh @robjtede have any comments on whether extensions you use could or could not be Clone
?
We exclusively use Clone
extensions. We are currently adding an HTTP wrapper layer and extensions are gated on being clone. Honestly, we could probably even avoid using extensions all together if we had to.
I'm mainly using them with axum's Extension
extractor: https://docs.rs/axum/latest/axum/struct.Extension.html
In order for this to work (by using it as a tower layer), the value put into the extension needs to implement Clone
in any case. I think that holds true for many other webframeworks as well.
Still, requiring Clone
is a pretty big change. A crater run like the rust project can do it would be pretty useful to get a better picture. Maybe it's possible to look at some popular crates depending on http
at least.
I'm sure there's others, but do @hawkw @rcoh @robjtede have any comments on whether extensions you use could or could not be
Clone
?
The linkerd
proxy doesn't currently use any request extensions that can't implement Clone
, besides hyper's OnUpgrade
. However, many of the extensions we use would not necessarily be correct to clone. For example, we use extensions for tracking the lifetime of a single instance of a request, which should be started over with a new extension if the request is retried. So, we would likely continue to implement retries by creating a new request with a fresh extension map, rather than cloning the request. I don't think requiring Clone
would break anything for us, though.
I think that making an E=Extensions
default generic on Request
and Parts
here keeps the option for folks to have non clone extensions if they absolutely need them without adding any complexity for most customers. e.g. you could have ClearOnCloneExtensions
to get the behavior @hawkw
described above.
I mean, we could also just "not write code that clones requests" to get that behavior. :)
extensions we use would not necessarily be correct to clone.
That seems like something that can't quite be defined as Clone
or not. You want that layer to not reuse the extension, but if some far inner service were to clone or do anything resembling an inner retry, that would be fine.
making an E=Extensions default generic on Request
A potential problem is that everything using http::Request<B>
would need to remember to also be generic over E
, or else they would essentially require the default impl.
extensions we use would not necessarily be correct to clone.
That seems like something that can't quite be defined as
Clone
or not. You want that layer to not reuse the extension, but if some far inner service were to clone or do anything resembling an inner retry, that would be fine.
Yeah, that's right. I only brought this up to say "we probably would not actually use a Clone
implementation for requests, but it wouldn't break stuff we're doing (besides hyper's OnUpgrade
)".
I think making extensions generic is going to make writing tower layers in tower-http much more painful than the benefit.
@hawkw can you implement that new request logic in terms of a non clonable wrapper? Like you can wrap your extension in a way that it reconstructs on clone rather than reusing?
Context: I was a heavy user of Extensions at a Cloudflare where I wrote HTTP and gRPC backends. I no longer have any active projects that use Extensions as my current workplace uses Dropshot.
All the items we put into Extensions
for requests are clone, so it should be fine.
On the other hand, some items we put into response extensions are !Clone, but as far as I understand that is not being proposed.
The Clone would apply to both Request and Response.
At least for us, the main reason we want clone is to easily recreate Parts
during retries without gymnastics. For us anyway, a minimal change like adding .clone_and_clear_extensions()
on Parts
would pretty much solve it for us. I guess for tests and things folks want to actually clone entire requests?
Actix Web doesn't use any of the higher-level constructs from this crate, including Extensions
, and requiring Clone
on the items in this container would not change that. Therefore, my view is not applicable to this particular issue.
The Clone would apply to both Request and Response.
Ok. I can always Arc the response extension types.
+1 for aclone_and_clear_extensions
or similar, as it would also help us create much easier retries.
Alright, it looks like the consensus is to move forward and require Clone
on extension types.
I was curious if we could use std::any::provide_any
etc to be able to query a Box<dyn Any>
for a dyn Clone
, and then we could conditionally clone all extensions that were cloneable without requiring it. I'm not sure that would even work. But I just learned that provide_any
is being de-scoped to just Error::request_ref
. It's hinted that provide_any
can be done outside of the standard library via the dyno
crate, but I haven't tried to do what I said.
Oh well.
PR is up: https://github.com/hyperium/http/pull/634
Will merge quickly, in case you want to chime in, since target launch is Nov 15.
I propose that we move forward to allow
Request<B: Clone>
toimpl Clone
. The big blocker here is thatextensions
is not clonable. This could be fixed by making a AnyClone trait:The downside here is that all things you store in extensions is cloneable but I think this is a fair trade off since mostly its just information that could be thrown behind an arc or bytes.
cc @seanmonstar @carllerche @hawkw @davidbarsky