seanmonstar / reqwest

An easy and powerful Rust HTTP Client
https://docs.rs/reqwest
Apache License 2.0

Add proxy support to RequestBuilder #804

Open egonny opened 4 years ago

egonny commented 4 years ago

With the current version, it is not obvious how to switch between multiple proxies within the connection pool that a Client provides. A simple way to give more control over proxy selection would be to add a proxy() method on RequestBuilder. This would make it possible to override whatever default proxy (if any) has been set on the Client.

I'm not certain this is easy to implement, given that Connector, which handles proxies, is built at the same time as the Client. If possible, I'd love to get involved with implementing this.

seanmonstar commented 4 years ago

So, you have a use case where you don't know at the time of constructing the Client which URLs should be handled by certain proxies, only when actually sending the request?

egonny commented 4 years ago

Correct. It also would make it easier to e.g. use a rotating set of proxies for just one URL, as I'm not sure how this would be best implemented currently.
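As an illustration of the rotating-set use case, here is a minimal sketch using only the standard library. `ProxyRotation` and `next_proxy` are hypothetical names and not part of reqwest; the sketch only chooses a proxy URL in round-robin order. Applying that URL to a single request is exactly the part this issue asks for, so with the current API the result would still have to be mapped to a separately built Client.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical helper (not a reqwest type): hands out proxy URLs in round-robin order.
pub struct ProxyRotation {
    proxies: Vec<String>,
    next: AtomicUsize,
}

impl ProxyRotation {
    /// Returns `None` for an empty list, since there would be nothing to rotate through.
    pub fn new(proxies: Vec<String>) -> Option<Self> {
        if proxies.is_empty() {
            None
        } else {
            Some(Self {
                proxies,
                next: AtomicUsize::new(0),
            })
        }
    }

    /// Picks the next proxy URL, wrapping around at the end of the list.
    /// The atomic counter lets several threads share one rotation.
    pub fn next_proxy(&self) -> &str {
        let i = self.next.fetch_add(1, Ordering::Relaxed) % self.proxies.len();
        &self.proxies[i]
    }
}
```

Each call to `next_proxy()` returns the following entry in the list and starts over after the last one.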

707090 commented 3 years ago

I don't think this should be added to RequestBuilder, because it conflicts with separation of concerns and muddles what the responsibility of a Request or a Client is. The Request should be responsible for specifying the thing which needs to be sent, and the Client should be responsible for how requests are sent.

Currently, because RequestBuilders must be built off of a Client, switching the proxy at request send time is annoying, but with an API that lets you specify the client at the moment of sending, this becomes much easier. I have a PR in progress to add this API.

After that interface is available, you could keep a ClientBuilder around which has all of your client configuration set except the proxy, then clone that builder and set the proxy to send the request with.
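A rough sketch of the "same configuration, different proxy" idea, written against the standard library only so that it does not depend on the API proposed above. `ClientsByProxy` is a hypothetical name, and the client type `C` is left generic. With reqwest, the `build` closure could presumably apply the shared settings and then call `reqwest::Client::builder().proxy(...)` followed by `build()`; that wiring is an assumption and is not shown here.

```rust
use std::collections::HashMap;

/// Hypothetical cache (not a reqwest type): one client per proxy URL,
/// built lazily by a closure that holds the shared configuration.
pub struct ClientsByProxy<C, F>
where
    F: Fn(&str) -> C,
{
    build: F,
    clients: HashMap<String, C>,
}

impl<C, F> ClientsByProxy<C, F>
where
    F: Fn(&str) -> C,
{
    pub fn new(build: F) -> Self {
        Self {
            build,
            clients: HashMap::new(),
        }
    }

    /// Returns the client for `proxy_url`, building it on first use
    /// and reusing it on later calls.
    pub fn get(&mut self, proxy_url: &str) -> &C {
        if !self.clients.contains_key(proxy_url) {
            let client = (self.build)(proxy_url);
            self.clients.insert(proxy_url.to_string(), client);
        }
        &self.clients[proxy_url]
    }

    /// Number of clients built so far.
    pub fn len(&self) -> usize {
        self.clients.len()
    }
}
```

Note the trade-off: every distinct proxy gets its own client, so connections are reused only among requests that go through the same proxy.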

I would also like to create a PR which allows easy interchangeability between a Request and a RequestBuilder, and between a Client and a ClientBuilder, but I have further development to do on that idea before it's ready for a PR. Once that is in place, you could use one client and mutate the proxy each time before sending the request.

untitaker commented 3 years ago

@707090 I strongly doubt that the separation of concerns you're describing is a driving reason behind this client/request separation in any HTTP library, as default_headers is already a violation of this separation. Your proposal would make it impossible to efficiently reuse connections, which is the main reason one would want to use Client.get over reqwest::get (the latter also being a mixing of concerns as per your model).

I've found this issue because there are multiple options currently living on the client that I would like to configure on a per-request level instead:

None of these have anything to do with the connection pool as far as I understand, so I don't see why I'd have to configure more than one client for them. In particular, response decompression is configurable on a per-request basis in actix-web's HTTP client (which I am coming from).

I think this could be achieved without too much code duplication by carrying a RequestBuilder within the client that is cloned whenever a request is opened. The setters for builders could be put onto an extension trait, but since we probably would not want to break the API, it would have to be generated via macros.

Let me know @seanmonstar if you think this makes sense and I'll try to come up with something.

seanmonstar commented 3 years ago

@untitaker for each option:

redirect policy -- certain endpoints (any POST)

I think we could easily include the request method on the redirect::Attempt, which would allow you to use a per-client policy.

automatic decompression

I suppose you mean you want to receive gzipped content, and on some requests auto-decompress, and on some leave alone?

referer

You said this is like the redirect policy, meaning you wouldn't want to include it with POST requests, or if the redirect policy didn't redirect a POST, that would solve this too?

untitaker commented 3 years ago

I think we could easily include the request method on the redirect::Attempt, which would allow you to use a per-client policy.

I personally think this is a less flexible and generally harder to use API than specifying the redirect policy when sending the request, but I have already worked around this in the past by disabling redirects entirely and explicitly handling them per-codepath.

I suppose you mean you want to receive gzipped content, and on some requests auto-decompress, and on some leave alone?

yup, concretely I want to reuse the connection pool in a proxy application that re-interprets and rewrites some requests on some routes and passes through others as-is.

You said this is like the redirect policy, meaning you wouldn't want to include it with POST requests, or if the redirect policy didn't redirect a POST, that would solve this too?

fair, I think so.

ip-rw commented 2 years ago

It would be extremely useful to be able to configure things like proxy, DNS overrides, and TLS client config per request.

Being able to change the proxy config per request would make integrating with proxy services easier. The username field is used to pass request config; it's a pretty standard thing now (https://[username]-[iso country code]-[sessionid]:password@proxy.shop). With HTTP you could presumably set the proxy auth header manually, but I'm not sure there is any way to add headers to the initial HTTP CONNECT request as it stands.
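To make the username convention concrete, here is a small sketch that assembles such a proxy URL from its parts. `ProxySession` and `proxy_url` are hypothetical names, the field layout follows the pattern quoted above, and the sketch assumes none of the parts need percent-encoding.

```rust
/// Hypothetical per-request settings that a proxy service reads from the username.
pub struct ProxySession<'a> {
    pub username: &'a str,
    /// ISO country code, e.g. "us".
    pub country: &'a str,
    pub session_id: &'a str,
}

/// Builds a URL of the form
/// `https://[username]-[iso country code]-[sessionid]:password@host`.
/// Assumption: no part contains characters that would need percent-encoding.
pub fn proxy_url(session: &ProxySession, password: &str, host: &str) -> String {
    format!(
        "https://{}-{}-{}:{}@{}",
        session.username, session.country, session.session_id, password, host
    )
}
```

Because the session settings are baked into the credentials, every distinct country or session id yields a different proxy URL, which is why a proxy fixed at Client construction time is a poor fit here.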

Without the ability to do DNS overrides or use the TLS client config to set the SNI, we have to resort to creating a client per request if we want functionality like curl's --resolve.

After that interface is available, you could keep a ClientBuilder around which has all of your client configuration set except the proxy, then clone that builder and set the proxy to send the request with.

This is a good idea. Thx

Rapptz commented 1 year ago

This is currently a blocker for any type of non-trivial web scraping work where having a large set of rotating proxies is common.