Open sosthene-nitrokey opened 6 days ago
@sosthene-nitrokey Thank you so much for taking the time to test this thoroughly. This is such great feedback that I can guarantee we need to address most of it and make an rc3.
I have some personal health issues right now preventing me from deeply focusing on coding. I welcome all help with addressing these concerns via PRs.
Config fields are private
Agree we need to address this. The main branch only did a builder, there config fields themselves are pub(crate)
. For semver's sake, I think we should add accessor functions on Config
for the properties, not expose them directly.
There would be benefits to not boxing the connectors
I like where this is going. It never felt great to box everything up the way it is now. I'd also been thinking one could explore a completely generics-based chain that is only abstracted/boxed on the top level, but that might be a bit of a hairier change.
The upstream connectors and transports are not public
Yeah, ureq 2.x suffered badly from overexposing stuff that just had to break semver (like re-exporting other crates). Thus I been very reluctant to make all of the internal workings public, thinking that if someone needs the TcpConnector for their local needs, they can just copy-paste the code.
In rc1 and still in rc2, I allowed myself exceptions to semver. Maybe it's a terrible idea, but we could potentially "carve out" exceptions to API stability by clearly marking it as such. Maybe the entire transport
module can be considered unstable at this point even if we proceed making the rest of the 3.0.0 release?
The typestate variants are not public
When I started building 3.x, my entry point was that "The http-crate is the public API". RequestBuilder
is therefore in my mind an enhancement to the http-crate request::Builder
- it even derefs to that. At one point I was even considering making these enhancements as extension traits.
It might be that I need to shift my thinking here. I would be open to a PR exporting these typestate variants under #[doc(hidden)]
since I don't want them cluttering the API at the same time as not disallowing the use case you mention.
If you want I can create dedicated issues (or even PRs) for the parts you think are worth it.
Again, thank you so much for taking the time. I will not be able to look into these things myself right now, but I do promise to get to them eventually. I can however review PRs, so I welcome any help you can spare!
Yeah, ureq 2.x suffered badly from overexposing stuff that just had to break semver (like re-exporting other crates). Thus I been very reluctant to make all of the internal workings public, thinking that if someone needs the TcpConnector for their local needs, they can just copy-paste the code.
I fully understand that concern, we are dealing with similar issues in our own libraries, which is why I suggested having a sibling crate (ureq-utils
, ureq-io
?) that would contain these implementation, but would not be re-exported by the main ureq
crate (only using it without exposing it through the default connector parameters) have more frequent breaking change updates. The issue would be that the Connector
trait would also have to be in its own crate
that is as stable as ureq
to avoid circular dependencies.
In rc1 and still in rc2, I allowed myself exceptions to semver.
I don't know if I would call this an exception to semver. Saying that a specific usage is a bug and its behaviour can change is something that is widely understood as being totally compatible with semver (as long as the types don't change and it's properly documented).
Maybe the entire transport module can be considered unstable at this point even if we proceed making the rest of the 3.0.0 release?
I don't like this idea. Splitting it out in its own crate would be make more sense.
Worst case, it could also be possible (even with the current design), to just copy-paste the connectors in another crate, which could even be done by a third party that needs it (we would probably go this way).
When I started building 3.x, my entry point was that "The http-crate is the public API". RequestBuilder is therefore in my mind an enhancement to the http-crate request::Builder - it even derefs to that. At one point I was even considering making these enhancements as extension traits.
There is a bit of a semver hazard in using a Deref
implementation like that no? It could be very possible that the http
crate end up adding a query
or query_pairs
method, which could lead to some confusing error messages no? I also notice that that query
does not actually modify the request builder it defers to, this might be confusing to some users.
But overall I agree that converging on http
as the API is a good idea.
Thank for being open to the feedback!
I don't like this idea. Splitting it out in its own crate would be make more sense.
I just recalled that rustls used to have a similar idea, so there's at least some precedence, granted they are still on semver 0.x.
My trouble with another crate is that I've been on the war path against increasing the dep count. In my view, maybe the main differentiating feature of ureq vs reqwest/static, is dep count.
To reduce dep count, I've been considering rolling hoot
back into ureq. Maybe a way forward is to rename hoot
-> ureq-proto
and roll in the Transport
definitions into that crate?
There is a bit of a semver hazard in using a Deref implementation like that no? It could be very possible that the http crate end up adding a query or query_pairs method, which could lead to some confusing error messages no?
True. I hadn't thought about the risk of them hijacking one of our names. If we do not Deref, we need to copy in like 7 fns we currently get via the Deref:
If the http crate adds a way to mutate the query parameters, we probably still need to bump major, since I prefer to delegate to their way of doing it instead of maintaining parallel ways.
I also notice that that query does not actually modify the request builder it defers to, this might be confusing to some users.
This is because I can't rebuild the query part of the uri without incurring the cost of allocation - something we try to avoid. Not ideal though.
I do not understand why there is such a widespread concern about dep count in the Rust ecosystem. IHMO the real benchmark is:
Separating a single crate into multiple sub-crates does not change any of these metrics (especially if the crates live in the same git repo, even if they have different release cadence). And regarding the overall issue, I think it is worse to have multiple implementation of the same thing through copy-pasted code than having one more dependencies that implements a common pattern an is reused multiple times.
In our specific case, the reason we use ureq
instead of reqwest
is not related to minimizing dependencies, but it's related to the fact that tokio
automatically spawns background threads and thus deals poorly with calling `fork.
I do not understand why there is such a widespread concern about dep count in the Rust ecosystem.
I hear you, but I see it slightly differently.
Maybe a way forward is to rename
hoot
->ureq-proto
and roll in theTransport
definitions into that crate?
Is this an OK compromise?
In our specific case, the reason we use ureq instead of reqwest is not related to minimizing dependencies, but it's related to the fact that tokio automatically spawns background threads and thus deals poorly with calling `fork.
Very interesting! Maybe this should be brought out in the docs? It's also very good input, because there been talks about "happy eyeballs" and other things that might require background threads to be spawned.
Is this an OK compromise?
Yes, that looks good!
Very interesting! Maybe this should be brought out in the docs?
What would that look like? Something like "ureq
does not spawn background threads, and if it ever does in the future, it will be done behind a feature flag, so this can be relied upon if default-features
is turned off." seems the least intrusive but also giving strong guarantees.
It's also very good input, because there been talks about "happy eyeballs" and other things that might require background threads to be spawned.
Would multithreading be the best implementation? the happy-eyeballs
crate does not use threads and uses non-blocking sockets instead. BTW with the transport abstraction this also means it can be implemented by downstreams instead of requiring an implementation in ureq
proper! At least for prototyping/testing it's interesting.
Is this an OK compromise?
Yes, that looks good!
Cool. I'll rename the hoot crate today. ureq-proto
. Then we can make a separate PR moving the transport definitions there.
What would that look like? Something like "
ureq
does not spawn background threads, and if it ever does in the future, it will be done behind a feature flag, so this can be relied upon ifdefault-features
is turned off." seems the least intrusive but also giving strong guarantees.
Yeah, that sounds almost perfect. I wouldn't even hide it behind feature flags, but say that the default configuration would never spawn background threads. The configuration options that will cause threads are clearly documented as such. This happens already today, but it's not brought out enough in the doc.
https://github.com/algesten/ureq/blob/main/src/resolver.rs#L147
Would multithreading be the best implementation? the
happy-eyeballs
crate does not use threads and uses non-blocking sockets instead.
Yes, with async everything is possible.
BTW with the transport abstraction this also means it can be implemented by downstreams instead of requiring an implementation in
ureq
proper! At least for prototyping/testing it's interesting.
This is an important reason I decided to rewrite to ureq 3. People come with PRs and suggestions that are really helpful, but I don't necessarily want to support them all myself. A ureq-happyballs
crate could be maintained by someone else.
Hi,
We are using ureq in our PKCS#11 module, thank you for this reliable and simple library.
We recently had to fork ureq in our use to implement missing functionality we wanted to have, mainly TCP keepalive and automatically discarding old connections.
We hope the 3.x.x branch allowing separate "connectors" and "transport" and the fact that it already supports a max age for idle connections would permit us to migrate back to upstream.
I attempted to port our implementation to the 3.0.0-rc2, here are the challenges I encountered:
Config fields are private
This prevents implementing some of the functionality of the upstream
TcpConnector
in our own implementation (for example there is no way to read the nodelay configuration of the config, preventing of implementing the same functionality in other crates. Looking at themain
branch, it looks like this has been fixed, so this may be outdated.There would be benefits to not boxing the connectors
In the connector traits, the transports are boxed, preventing chained connectors to make use of specificities of the previous transport. Ideally for example, it would be possible to use the specific type of the previous transport. For example a TCP keepalive connector could just wrap or come after a TcpTransport, get its TCP stream, and enable Keepalive on it.
The way I would do it would use associated types for the transports, and make
chained
and not chained connectors part of the trait.With an implementation on tuples of many sizes through a macro:
You could still keep the
ChainedConnector
for oneConnector
+ a chain ofMiddleConnector<AnyTransport>
that could also be boxed.For connectors that may or may not "chain" like the SocksConnector followed by the TcpConnector, I would instead implement the
SocksConnector
as a:This type safety would allow for example a keepalive connector, which is much simpler that re-implementing all the functionality of the TcpConnector just to add that. It makes connectors much more composable.
The upstream connectors and transports are not public
This goes with the previous point. Building a custom chain of connectors could be simpler if the upstream connectors were public. I understand that doing so would be a semver hazard, so maybe they should be in a separate crate (
ureq-utils
maybe?), to allow building on top of them, but allow the connectors to be more frequently updated.The typestate variants are not public
The types
WithoutBody
andWithBody
are not public (same for theConfig
scopes), which means it's not possible to write function that expect or return aRequestBuilder<WithoutBody>
.If you do not want these types to be public for semver reasons, would it be possible to still expose a type alias:
type RequestWithBody = RequestBuilder<WithBody>
,andtype RequestWithoutBody = RequestBuilder<WithoutBody>
, so that they be used in functions signatures?Thank you for the work on 3.0.0, this looks like it will certainly help ureq be more flexible!
If you want I can create dedicated issues (or even PRs) for the parts you think are worth it.