pypa / pip

The Python package installer
https://pip.pypa.io/
MIT License

HTTPS certificate verification fails if using proxy pip 7.1.2 #3215

Closed mayukuse24 closed 8 years ago

mayukuse24 commented 8 years ago

When using a proxy (CONNECT via plain HTTP), pip gets the HTTPS certificate verification wrong. Rather than verifying that the certificate received through the tunnel matches the host at the tunnel's end, it compares it to the proxy itself:

Getting page https://pypi.python.org/simple/plog/
Could not fetch URL https://pypi.python.org/simple/plog/: connection error: hostname 'proxy.iiit.ac.in' doesn't match either of 'www.python.org', 'python.org', 'pypi.python.org', 'docs.python.org', 'testpypi.python.org', 'bugs.python.org', 'wiki.python.org', 'hg.python.org', 'mail.python.org', 'packaging.python.org', 'pythonhosted.org', 'www.pythonhosted.org', 'test.pythonhosted.org', 'us.pycon.org', 'id.python.org'

Although this issue was closed in https://github.com/pypa/pip/issues/1905 for pip 1.5.6, it still shows up in version 7.1.2.

sigmavirus24 commented 8 years ago

So my first question is: How did you install pip?

mayukuse24 commented 8 years ago

I installed pip with easy_install.

sigmavirus24 commented 8 years ago

And how have you configured proxies? In your pip config file? In the environment? Is it something like https://proxy.iiit.ac.in?

mayukuse24 commented 8 years ago

I configured the proxy in the environment: "export http_proxy=http://proxy.iiit.ac.in:8080"

sigmavirus24 commented 8 years ago

Verrrrry interesting. @Lukasa thoughts as to why this might be happening?

chrullrich commented 8 years ago

Works for me with pip 7.1.2 on Python 3.4.3 (v3.4.3:9b73f1c3e601, Feb 24 2015, 22:44:40) [MSC v.1600 64 bit (AMD64)] on win32. Proxy is configured via $http_proxy and $https_proxy, and is used (it would work directly from where I am, so I checked). The proxy is actually a local SSH tunnel to the actual proxy, but that should not matter.

I noticed that it did not use the proxy (connected directly, but still worked) when only $http_proxy was set, but it did after setting $https_proxy as well. @mayukuse24, this does not match your symptoms, but are you sure your pip is actually using your proxy?

In my logs, the access pattern from py -m pip --verbose install pygments (just for testing, and the "py -m pip" is recommended on Windows) is always the same. This is from two calls, the first field is a timestamp:

1446148469.925   4115 192.168.10.2 TCP_TUNNEL/200 682692 CONNECT pypi.python.org:443 - HIER_DIRECT/185.31.17.223 -
1446148470.557    593 192.168.10.2 TCP_TUNNEL/200 51220 CONNECT pypi.python.org:443 - HIER_DIRECT/185.31.17.223 -
1446148481.679   4206 192.168.10.2 TCP_TUNNEL/200 682692 CONNECT pypi.python.org:443 - HIER_DIRECT/185.31.17.223 -
1446148482.383    665 192.168.10.2 TCP_TUNNEL/200 51220 CONNECT pypi.python.org:443 - HIER_DIRECT/185.31.17.223 -

The big request (682692 bytes) is the wheel package, the other one must be the PyPI lookup.

Lukasa commented 8 years ago

@mayukuse24 My guess here is that your environment variables look like this:

"export http_proxy=http://proxy.iiit.ac.in:8080"
"export https_proxy=https://proxy.iiit.ac.in:8080"

That is subtly wrong: setting HTTPS in the scheme of the proxy URL will cause us to create a TLS connection directly to the proxy itself, which would then lead us to attempt to validate the proxy's TLS cert as valid for the remote domain (which it won't be).

Requests (and so pip) does not support tunneling TLS through TLS, so that won't work. If you change your HTTPS_PROXY environment variable to "export https_proxy=http://proxy.iiit.ac.in:8080" that should work.
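The misconfiguration described here can be checked mechanically. The following is a minimal sketch, assuming Python 3; `proxies_with_tls_scheme` and `PROXY_VARS` are hypothetical names introduced for illustration and are not part of pip or Requests. It only reports proxy variables whose URL scheme is https and does not change anything.

```python
import os
from urllib.parse import urlsplit

# Proxy variables to inspect (both spellings; an assumption for this sketch).
PROXY_VARS = ("http_proxy", "https_proxy", "HTTP_PROXY", "HTTPS_PROXY")


def proxies_with_tls_scheme(env=None):
    """Return {variable: url} for proxy variables whose URL scheme is https."""
    if env is None:
        env = os.environ
    flagged = {}
    for name in PROXY_VARS:
        url = env.get(name)
        # An https:// proxy URL asks the client to speak TLS to the proxy itself.
        if url and urlsplit(url).scheme.lower() == "https":
            flagged[name] = url
    return flagged


if __name__ == "__main__":
    for name, url in proxies_with_tls_scheme().items():
        print("%s uses an https:// proxy URL: %s" % (name, url))
```

If the function returns an empty dict, the proxy variables already use the plain http:// scheme that is recommended here.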

mayukuse24 commented 8 years ago

@Lukasa I tried what you suggested, but it doesn't seem to resolve the issue.

@chrullrich I updated packages using pip and it works, so I am assuming that pip is detecting my proxy.

Lukasa commented 8 years ago

@mayukuse24 Can you show me your _proxy environment variables please?

mayukuse24 commented 8 years ago

env | grep "http"

https_proxy=http://proxy.iiit.ac.in:8080
http_proxy=http://proxy.iiit.ac.in:8080

Lukasa commented 8 years ago

And your errors are the same as the original post?

mayukuse24 commented 8 years ago

Yep. I changed the root proxy also, just in case. Is it possible that pip is using some other proxy settings where the configuration is as you (@Lukasa) mentioned?

Lukasa commented 8 years ago

No, I don't think so. What version of pip are you using, please?

mayukuse24 commented 8 years ago

I am using pip 7.1.2

Lukasa commented 8 years ago

Hmm, this is quite perplexing. Are you familiar with tcpdump/wireshark?

mayukuse24 commented 8 years ago

I have used wireshark before. No idea about tcpdump though.

Lukasa commented 8 years ago

@mayukuse24 In that case, can you use Wireshark to capture all traffic leaving your machine destined for port 8080 on proxy.iiit.ac.in? I'm interested to see whether this is plaintext HTTP or HTTPS.

mayukuse24 commented 8 years ago

The output file is very large and I don't have write permission to upload it.

But in the Protocol column I see only TCP, TLS and HTTP.

Lukasa commented 8 years ago

How large is "very large"?

mayukuse24 commented 8 years ago

4228 lines long

Are you looking for these lines?

No.   Time          Source      Destination  Protocol  Length  Info
1404  56.768318000  10.4.8.204  10.1.39.16   TCP       60      8080→38216 [RST] Seq=1 Win=0 Len=0

4228 lines are made up of these

Lukasa commented 8 years ago

Ideally, you'd save the capture as a .pcapng file, and then store that somewhere. GitHub gists are allowed to be binary files, so that could work.

mayukuse24 commented 8 years ago

I tried uploading the .pcapng file, but even a gist does not accept it, so I made a Dropbox link to the file instead:

https://www.dropbox.com/s/x8tjw6emi7tyc11/pipout.pcapng?dl=0

Lukasa commented 8 years ago

So for those who are interested, TCP stream 33 in that capture file is a good example to look at. Here we can see that the CONNECT verb is being properly used, so the problem is not directly in our communication to the proxy, and the correct certificates are being provided.

So this really is us getting this wrong. What version of Python are you using?

mayukuse24 commented 8 years ago

Using python version 2.7.9

Lukasa commented 8 years ago

So, I do not observe this with Python 2.7.10 and requests 2.8.1, running Charles on my local box. Let me try with pip directly.

Lukasa commented 8 years ago

Yup, when running against a Charles proxy on my local host, I do not encounter this problem.

@mayukuse24 let's try this a new way. With your environment set up as it is (i.e. with your environment variables in place), can you run the following Python code?

import requests

# Fetch a page over HTTPS; with the proxy variables set, this goes through the proxy.
r = requests.get('https://mkcert.org/generate/')
print(r.content)

You may need to download a requests tarball from upstream manually to do this if you don't already have requests.

mayukuse24 commented 8 years ago

@Lukasa The output of the code you asked to run:-

https://gist.github.com/mayukuse24/914bde018f5fa0441e4b

Lukasa commented 8 years ago

@mayukuse24 And that ran via your proxy, yes?

mayukuse24 commented 8 years ago

yes it did run via my proxy

Lukasa commented 8 years ago

Which version of requests did you run?

mayukuse24 commented 8 years ago

requests version 2.8.1

Lukasa commented 8 years ago

I don't believe any major changes have happened, but just in case can you try again with requests 2.7.0? (That's the version pip 7.1.2 uses)

mayukuse24 commented 8 years ago

output of requests 2.7.0

https://gist.github.com/mayukuse24/4e7cf890ff0efc1e2eed

Lukasa commented 8 years ago

Yup, so requests is working fine. That suggests this is a pip problem.

Lukasa commented 8 years ago

@dstufft Do you do any tweaking of proxy stuff in requests? Any setting of proxy values or anything like that?

dstufft commented 8 years ago

https://github.com/pypa/pip/blob/develop/pip/basecommand.py#L88-L93

Lukasa commented 8 years ago

Hmm. @mayukuse24 are you setting proxies specially for pip?

mayukuse24 commented 8 years ago

@Lukasa pip install only works if the env proxy variables are set:-

https_proxy=http://proxy.iiit.ac.in:8080
http_proxy=http://proxy.iiit.ac.in:8080

Therefore I am assuming that these are the proxy variables being used by pip.

Lukasa commented 8 years ago

By "works", do you mean that it throws the error above?

mayukuse24 commented 8 years ago

If I install a package using pip directly, it works. However, the error shows up when another Python script tries to install a package using pip.

sigmavirus24 commented 8 years ago

Oh that was completely not obvious from your original report. In that case if you're using Popen you need to be sure it passes those environment variables to what it's using.
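Passing the variables explicitly could look something like the sketch below. It assumes Python 3.7 or later; `run_with_proxy` is a hypothetical helper, not a pip API, and the child process here is a stand-in that only prints the proxy it sees instead of actually invoking pip.

```python
import os
import subprocess
import sys


def run_with_proxy(command, proxy_url):
    """Run `command` with http_proxy/https_proxy set in the child's environment."""
    env = os.environ.copy()         # start from the parent's environment
    env["http_proxy"] = proxy_url
    env["https_proxy"] = proxy_url  # same plain http:// proxy URL for HTTPS traffic
    return subprocess.run(command, env=env, capture_output=True, text=True)


if __name__ == "__main__":
    # Stand-in for a pip call: the child reports the proxy variable it received.
    child = [sys.executable, "-c",
             "import os; print(os.environ.get('https_proxy'))"]
    result = run_with_proxy(child, "http://proxy.iiit.ac.in:8080")
    print(result.stdout.strip())
```

In a real script, the command would be the pip invocation, for example `[sys.executable, "-m", "pip", "install", name]`. Copying `os.environ` instead of building a fresh dict keeps PATH and other settings that the child may need.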

aminorex commented 8 years ago

Can't we get an option to disable certificate checks? I always end up monkey-patching to work-around this.

connaryscott commented 8 years ago

Python and proxies are absolute hell when connecting to a TLS endpoint. It works on my Mac but does not work on Ubuntu with version 2.7.10.

dstufft commented 8 years ago

I'm going to close this. It appears that pip was actually working in the original post but environment variables weren't correctly being passed when a script was calling pip via a subprocess. As far as disabling certificate checks go, I believe --trusted-host allows that already.

haridsv commented 7 years ago

Just for future reference, I faced this exact same issue when running Ansible pip module to implicitly create virtualenv and install packages, using something like this:

    - name: Install pip packages
      pip:
        name: "{{item}}"
        virtualenv_command: /opt/python27/bin/virtualenv
        virtualenv: /var/lib/jenkins/python
        virtualenv_python: /opt/python27/bin/python
      with_items: "{{pip_packages}}"
      environment:
          https_proxy: "{{https_proxy}}"
      tags: pip

I have the below installed at the system level:

$ python -V
Python 2.7.5
$ pip --version
pip 1.3.1 from /usr/lib/python2.7/site-packages (python 2.7)
$ virtualenv --version
1.10.1

But the one at /opt/python27 has the below versions:

$ /opt/python27/bin/python -V
Python 2.7.13
$ /opt/python27/bin/pip --version
pip 9.0.1 from /opt/python27/lib/python2.7/site-packages (python 2.7)
$ /opt/python27/bin/virtualenv --version
15.1.0

In this case, the Ansible pip module is somehow picking up system pip which is very old and seems to have a bug in proxy handling. There is an executable attribute to point the module to the specific pip to use, but when this option is used, you can no longer create a virtualenv implicitly. I don't fully understand what is happening here yet, but when creating virtualenv implicitly, I think the Ansible pip module is picking up the wrong pip executable path. The workaround in my case is to create virtualenv explicitly first and then use executable option, like this:

    - name: Create Python virtualenv
      command: /opt/python27/bin/virtualenv --no-download /var/lib/jenkins/python
      tags: pip

    - name: Install pip packages
      pip:
        executable: /var/lib/jenkins/python/bin/pip
        name: "{{item}}"
      with_items: "{{pip_packages}}"
      environment:
          https_proxy: "{{https_proxy}}"
      tags: pip

I am actually not sure if there is any gain using the Ansible pip module over natively running the pip command in this case, so one could just directly run the native pip.
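One way to see which pip a process would resolve first is to walk the search path in order. This is a minimal sketch, assuming Python 3 on a POSIX system; `find_executables` is a hypothetical helper, and it does not claim to reproduce how the Ansible pip module locates pip.

```python
import os


def find_executables(name, search_path=None):
    """Return every executable called `name` on the search path, in lookup order."""
    if search_path is None:
        search_path = os.environ.get("PATH", "")
    matches = []
    for directory in search_path.split(os.pathsep):
        if not directory:
            continue  # skip empty PATH entries
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            matches.append(candidate)
    return matches


if __name__ == "__main__":
    # The first entry printed is the one a plain "pip" command would run.
    for path in find_executables("pip"):
        print(path)
```

Running each listed path with `--version` would then show whether an old system pip sits ahead of the intended one.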

I just thought of adding this information here as this comes up as the first result in google for this error.

haridsv commented 7 years ago

Just to clarify the below closing statements:

"It appears that pip was actually working in the original post but environment variables weren't correctly being passed when a script was calling pip via a subprocess."

The issue was not with the environment variables. In fact, if the variable had not been passed correctly, pip wouldn't even complain about the proxy. In my case, without the proxy variable, it just hangs and times out because of firewall restrictions.

"As far as disabling certificate checks go, I believe --trusted-host allows that already."

I tried this too before coming up with the above workaround, and this didn't help. I think the option is only applicable for the download servers, not for the proxy host itself.

Lukasa commented 7 years ago

@haridsv It should be noted that using $HTTPS_PROXY set to a URL beginning https:// is not really supported by Requests (the underlying HTTP layer), and can lead to all kinds of odd errors. You'd need to set the URL to http:// to avoid those.

haridsv commented 7 years ago

@Lukasa I have https:// in my HTTPS_PROXY variable and it seems to be working fine with pip as well as most other tools, such as curl and Java. I also tried to see if using http:// will help with the pip error here, but it didn't. However, I will keep this in mind and watch out for any weird issues from Python tools, thanks for the suggestion.