Open egonny opened 4 years ago
So, you have a use case where you don't know at the time of constructing the Client what URLs should be handled by certain proxies, only when actually sending the request?
Correct. It also would make it easier to e.g. use a rotating set of proxies for just one URL, as I'm not sure how this would be best implemented currently.
I don't think this should be added to RequestBuilder, because it conflicts with separation of concerns and muddles what the responsibility of a Request or a Client is. The Request should be responsible for specifying what needs to be sent, and the Client should be responsible for how requests are sent.
Currently, because RequestBuilders must be built off of a Client, something like switching proxy at request send time is annoying, but with an API that lets you specify the client at the moment of sending, this becomes much easier. I have a PR in progress to add this API.
After that interface is available, you could keep a ClientBuilder around which has all of your client configuration set except the proxy, then clone that builder and set the proxy to send the request with.
I would also like to create a PR which allows easy interchangeability between a Request and RequestBuilder, and between a Client and ClientBuilder, but I need to develop that idea further before it's ready for a PR. Once that is in place, you could use one client and mutate the proxy each time before sending the request.
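Until such an API exists, one workaround in the spirit of this comment is to build one client per proxy once and pick among them at send time. The following is only a minimal sketch using the standard library: `ProxyRotator` is a hypothetical name, the generic `C` stands in for whatever client type you actually build (for example a `reqwest::Client` configured with a proxy), and the `build` closure is an assumption of this sketch rather than reqwest API.

```rust
use std::collections::HashMap;

/// Hypothetical helper: hands out one cached client per proxy URL, round-robin.
/// `C` stands in for a real client type (e.g. `reqwest::Client`).
struct ProxyRotator<C> {
    proxies: Vec<String>,
    clients: HashMap<String, C>,
    next: usize,
}

impl<C> ProxyRotator<C> {
    fn new(proxies: Vec<String>) -> Self {
        ProxyRotator { proxies, clients: HashMap::new(), next: 0 }
    }

    /// Returns the next proxy URL and its client. The client is built with
    /// `build` only the first time a proxy is used, so later requests through
    /// the same proxy can share that client's connection pool.
    fn next_client<F>(&mut self, build: F) -> Option<(&str, &C)>
    where
        F: FnOnce(&str) -> C,
    {
        if self.proxies.is_empty() {
            return None;
        }
        let proxy = &self.proxies[self.next];
        // Advance the cursor, wrapping around at the end of the list.
        self.next = (self.next + 1) % self.proxies.len();
        let client = self
            .clients
            .entry(proxy.clone())
            .or_insert_with(|| build(proxy.as_str()));
        Some((proxy.as_str(), &*client))
    }
}
```

Each proxy still gets its own connection pool here, so this avoids rebuilding clients but does not share connections across proxies.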
@707090 I strongly doubt that the separation of concerns you're describing is a driving reason behind this client/request separation in any HTTP library, as default_headers is already a violation of this separation. Your proposal would make it impossible to efficiently reuse connections, which is the main reason one would want to use Client.get over reqwest::get (the latter also being a mixing of concerns as per your model).
I've found this issue because there are multiple options currently living on the client that I would like to configure on a per-request level instead:
- ClientBuilder.redirect: certain endpoints (any POST) are not supposed to ever redirect in my scenario
- ClientBuilder.gzip: I have a (fringe) use case for this where I want to pass through the HTTP response as my own
- ClientBuilder.referer: same as the redirect policy, though I can live with disabling it globally
None of these have anything to do with the connection pool as far as I understand, so I don't see why I'd have to configure more than one client for them. In particular, response decompression is configurable on a per-request basis in actix-web's HTTP client (which I am coming from).
I think this could be achieved without too much code duplication by carrying a RequestBuilder within the client that is cloned whenever a request is opened. The setters for builders could be put onto an extension trait, but since we probably would not want to break the API, it would have to be generated via macros.
Let me know @seanmonstar if you think this makes sense and I'll try to come up with something.
@untitaker for each option:
redirect policy -- certain endpoints (any POST)
I think we could easily include the request method on the redirect::Attempt, which would allow you to use a per-client policy.
automatic decompression
I suppose you mean you want to receive gzipped content, and on some requests auto-decompress, and on some leave alone?
referer
You said this is like the redirect policy, meaning you wouldn't want to include it with POST requests, or if the redirect policy didn't redirect a POST, that would solve this too?
I think we could easily include the request method on the redirect::Attempt, which would allow you to use a per-client policy.
I personally think this is a less flexible and generally harder to use API than specifying the redirect policy when sending the request, but I have already worked around this in the past by disabling redirects entirely and explicitly handling them per-codepath.
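The workaround mentioned here (disable automatic redirects on the client, then decide per code path) can be sketched as a small decision function. This is a simplified model using only the standard library, not reqwest's types: `Method`, `Next`, and `next_step` are hypothetical names, and the rule itself (never follow for POST, cap the hop count otherwise) is an assumption taken from the scenario described in this thread.

```rust
/// Simplified request methods for this sketch (not reqwest's `Method`).
#[derive(Clone, Copy, Debug, PartialEq)]
enum Method {
    Get,
    Post,
}

/// What the caller should do after inspecting a response.
#[derive(Debug, PartialEq)]
enum Next {
    Follow(String),
    Stop,
}

/// Decides how to handle a response when the client itself never follows
/// redirects. Only GET requests are followed, and only up to `max_hops`.
fn next_step(
    method: Method,
    status: u16,
    location: Option<&str>,
    hops: usize,
    max_hops: usize,
) -> Next {
    let is_redirect = matches!(status, 301 | 302 | 303 | 307 | 308);
    match (is_redirect, method, location) {
        (true, Method::Get, Some(target)) if hops < max_hops => {
            Next::Follow(target.to_string())
        }
        // POST responses, non-redirect statuses, missing Location headers,
        // and exhausted hop budgets all end the chain.
        _ => Next::Stop,
    }
}
```

With a real client, the caller would read the status code and the Location header from the response and loop while this returns `Next::Follow`.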
I suppose you mean you want to receive gzipped content, and on some requests auto-decompress, and on some leave alone?
yup, concretely I want to reuse the connection pool in a proxy application that re-interprets and rewrites some requests on some routes and passes through others as-is.
You said this is like the redirect policy, meaning you wouldn't want to include it with POST requests, or if the redirect policy didn't redirect a POST, that would solve this too?
fair, I think so.
It would be extremely useful to be able to configure things like the proxy, DNS overrides, and TLS client config per request.
Being able to change the proxy config per request would make integrating with proxy services easier. The username field is used to pass request config; it's a pretty standard thing now (https://[username]-[iso country code]-[sessionid]:password@proxy.shop). With HTTP you could presumably set the proxy auth header manually, but I'm not aware of any way to add headers to the initial HTTP CONNECT request as it stands.
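To make the username convention concrete, here is a minimal sketch of assembling such a proxy URL per request. The function name `proxy_url` and the sample values are hypothetical, and the sketch assumes none of the parts contain characters that would need percent-encoding in a URL.

```rust
/// Builds a proxy URL whose username carries per-request options, following
/// the `[username]-[iso country code]-[sessionid]:password@host` pattern.
/// Assumes the inputs contain no reserved URL characters.
fn proxy_url(user: &str, country: &str, session: &str, password: &str, host: &str) -> String {
    format!("https://{}-{}-{}:{}@{}", user, country, session, password, host)
}
```

Because every distinct country or session yields a different URL, each combination currently needs its own proxy configuration, and therefore its own client.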
Without the ability to do DNS overrides, or to configure the TLS client to set the SNI, we have to resort to creating a client per request if we want functionality like curl's --resolve.
After that interface is available, you could keep a ClientBuilder around which has all of your client configuration set except the proxy, then clone that builder and set the proxy to send the request with.
This is a good idea. Thx
This is currently a blocker for any type of non-trivial web scraping work where having a large set of rotating proxies is common.
With the current version, it is not obvious how to switch between multiple proxies in the connection pool that a Client provides. A simple way to give more control over proxy selection would be to add a proxy() method on RequestBuilder. In this case, it would be possible to override whatever default proxy (if any) has been set on Client.
I'm not certain if this is easily implementable, looking at how Connector handles proxies, which is built at the same time as Client. If possible, I'd love to get involved with implementing this.
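The lookup order such a per-request override would imply can be modelled in a few lines. This is only an illustration: `SketchClient`, `SketchRequest`, and `effective_proxy` are stand-in names, not reqwest types, and the `proxy()` method on the request side is the hypothetical API being proposed, not something that exists today.

```rust
/// Stand-in for a client with an optional default proxy (not reqwest's type).
struct SketchClient {
    default_proxy: Option<String>,
}

/// Stand-in for a request builder that can carry a proxy override.
struct SketchRequest<'c> {
    client: &'c SketchClient,
    url: String,
    proxy: Option<String>,
}

impl SketchClient {
    fn get(&self, url: &str) -> SketchRequest<'_> {
        SketchRequest { client: self, url: url.to_string(), proxy: None }
    }
}

impl<'c> SketchRequest<'c> {
    /// Hypothetical per-request override, mirroring the proposed `proxy()` method.
    fn proxy(mut self, proxy: &str) -> Self {
        self.proxy = Some(proxy.to_string());
        self
    }

    /// The proxy this request would use: the override wins, otherwise the
    /// client's default, otherwise none (a direct connection).
    fn effective_proxy(&self) -> Option<&str> {
        self.proxy
            .as_deref()
            .or(self.client.default_proxy.as_deref())
    }
}
```

This models only the selection logic; the harder part raised above, how the Connector and its connection pool would honor a proxy chosen at send time, is left open.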