Open fakeshadow opened 1 year ago
Wouldn't most people allocate again anyways (if its used in a Path
On a further look a zero copy value is also possible in form like this:
struct Param {
key: BytesStr,
value: BytesStr
}
impl Node {
fn at(&self, path: impl Into<ByteStr>) {}
}
This would hit the perf even further when given a str to match with one extra copy and a couple of reference counted slicing on it. But in real world http
can possibly be used to hand out an already existing ByteStr to enable zero copy. Related code: https://github.com/hyperium/http/blob/34dc1ccb4786c8026e57404fa5e71275978a55f3/src/uri/path.rs#L13
I like that Params
are zero-copy, and I haven't noticed usability issues with the lifetimes. In my toy astra/mathcing project I barely every write down the lifetimes: https://github.com/dpc/htmx-sorta/blob/5e9d2fe026ad9e2baae20368e59c7237c0671161/src/routes.rs#L64
The only problem I've noticed is the one described in #40 and describe there a way to delay any allocations to the point where actual path arguments are known to exist. In the (I assume common case) where there is None, no allocations need to happen.I use Box
there, but Arc<_>
, Cow<_>
etc would work fine as well.
http/hyper already uses Bytes internally, though not fully exposed in it's public API, so we might still be able to be zero-copy in the common use case.
It's when you use it in a complicated web framework abstraction the lifetime pollution becomes apparent. This is how matchit is used in axum: https://github.com/tokio-rs/axum/blob/368c3ee08fc3896358d3bd2bfc8cc67f2925c6ef/axum/src/routing/url_params.rs#L24
There are mainly two reason for less practical use of reference based param in this case: Exposing lifetime in public API like Service<Request<'a>>
is harder to use for library consumer and typical type map for runtime type casting require Any which requires 'static lifetime.
In xitca I use a fork of matchit with ref count based string type and inline optimized small string which is both easier to use in complicated abstraction and faster with less allocation.
http/hyper already uses Bytes internally, though not fully exposed in it's public API, so we might still be able to be zero-copy in the common use case.
There is a related issue: https://github.com/hyperium/http/issues/484 though it's unlikely to happen before some form of widely used ref count string type exists (either in std or 3rd party).
I guess the most generic way of supporting this would be to return a Range<usize>
for values instead of a string slice and punting the choice of string type to users, though you would still have to re-allocate a Params
list. For lifetime-less keys though we'd still have to use some sort of Arc<str>
internally.
It's when you use it in a complicated web framework abstraction the lifetime pollution becomes apparent. This is how matchit is used in axum: https://github.com/tokio-rs/axum/blob/368c3ee08fc3896358d3bd2bfc8cc67f2925c6ef/axum/src/routing/url_params.rs#L24
That doesn't look bad. Seems it comes down to "Params
is holding references to stuff and has a lifetime".
Anyway, I agree that Params
is not best for general purpose consumption. My observation with #40 is that can do route matching and get a cheap Params
, and then you convert it to something else and due to the knowledge of what was parsed it might be cheaper (free in many cases) than cloning path/url upfront.
I thought that maybe matchit
should support it internally, but now it occurred to me that it would be already possible to do it externally, if matchit
provided some facilities for it (expose some internals, or some methods to iterate over it). What you actually convert it to does not even have to be one and only one thing. You have your favorite short-string library etc., you can use it.
These two effort could be combined. Matching could support internally some conversions to representations that are more useful for general purpose code at the cost of extra allocations etc, and then also enable developers converting to whatever they want as well.
If hyper
can expose url as Bytes
etc. that would change the situation and make more things possible.
A trade off between less lifetime and reference counted overhead + an extra public type. Current Param(s) type are pretty hard to use if not consumed inline. From what I see most people would do an extra heap allocation to erase the lifetime(s) so in that case this would likely be cheaper.