jelmer / dulwich

Pure-Python Git implementation
https://www.dulwich.io/
Other
2.06k stars 395 forks source link

Add support for no_proxy environment variable #932

Open epopcop opened 2 years ago

epopcop commented 2 years ago

I'm using dulwich in a special environment where the proxy settings are out of my control. One of the features there supports "whitelisting" a set of addresses from the proxy, and these are written to the "no_proxy" environment variable, which is not standardized, but widely used, for example by urllib or requests, but not by urllib3.

Will it be possible to add this check? I will try doing so myself and will submit a PR if succeeded. The main issue I see now is that default_urllib3_manager() doesn't get the repo url or domain. Can I get it from the config parameter?

Thank you!

jelmer commented 2 years ago

Is this something git itself supports somehow?

jelmer commented 2 years ago

I'd be open to supporting git config variables that aren't currently supported.

Alternatively, you could construct a custom implementation of urllib3 manager that has the behaviour you need?

epopcop commented 2 years ago

I'm not that good at c - but I believe git takes the variable into consideration here Looks like it's using curl to make the calls - and it's passing the no_proxy variable to curl behind the scenes.

I actually have a couple of ways to work around this because I barely use porcelain, so I think of overriding HttpGitClient in client.py with a custom client that checks the environment variables and passes the right pool manager. This is similar to how the other tools do it. There are ups and downs in this, but I do want to see that in dulwich eventually (if possible of course).

epopcop commented 2 years ago

As stated in urllib3/urllib3#1785 - urllib3 will not support proxy environment variables, and higher API clients should support it instead (Can't really understand why.. urllib and urllib2 do respect env variables)

I didn't want to do that at first, But I think I'll go with the alternative solution you suggested, but instead of using urllib3 I'll use requests. It gives me all the options I need, as well as better support in my environment. It might not be the best for dulwich, but I'll be honest, I think you should consider migrating to requests as the default HTTP client too. You won't have to handle redirects, pool or proxy managers, and encodings, and it's much easier to maintain, but that's a personal perspective.

I will make a PR with a requests vendor to dulwich.contrib to help others who might encounter this.

jelmer commented 2 years ago

requests is higher level, and doesn't allow as much control over behaviour. This makes it harder to follow the behaviour of C Git. Providing a custom pool manager also potentially gives you a lot of control and flexibility over how HTTP requests are handled. Migrating would also mean changing arguments which make break other users (since the pool_manager argument would have to be dropped), and adds extra dependencies.

urllib3 does follow redirects and can deal with encoding FWIW. AFAICT the only thing that requests provides here is support for proxy environment variables, and the only one of those that (I think?) isn't supported by dulwich today is no_proxy.

epopcop commented 2 years ago

I guess you're right, I probably don't see the entire picture here. For me, since the environment is configured towards requests, there are more environment variables that are requests specific, like REQUESTS_CA_BUNDLE. Thanks for the feedback in the PR btw

Going back to the original subject, would you consider adding support for no_proxy?

jelmer commented 2 years ago

On Fri, Feb 04, 2022 at 02:35:20PM -0800, epopcop wrote:

I guess you're right, I probably don't see the entire picture here. For me, since the environment is configured towards requests, there are more environment variables that are requests specific, like REQUESTS_CA_BUNDLE. Thanks for the feedback in the PR btw

Going back to the original subject, would you consider adding support for no_proxy? Yes, I think we should add support for no_proxy.