psf / requests

A simple, yet elegant, HTTP library.
https://requests.readthedocs.io/en/latest/
Apache License 2.0
52.14k stars 9.33k forks source link

requests.packages monkey patching of unbundled urllib3 does not work #3287

Closed jayvdb closed 7 years ago

jayvdb commented 8 years ago

In requests 2.5.2 requests.packages.__init__ added code to allow requests.packages.urllib3 to not exist, and it would fall back to using urllib3.

No doubt this works in some cases, as the people on #2375 were happy with it, but it doesnt work with pyopenssl, resulting in no SNI/etc.

requests.packages.__init__ only injects urllib3 as requests.packages.urllib3.

However all of the other modules within urlllib3 will have different copies in the namespace requests.packages.urllib3.

requests.packages.urllib3.contrib.pyopenssl then writes to a separate copy of requests.packages.urllib3.util and requests.packages.urllib3.connection, and these changes are never seen by urllib3.

I encountered this on debian stretch & sid's python-requests 2.10.0-1 with python-urllib3 1.15.1 when running on Python 2.7.6 (Ubuntu Trusty), and haven't retested yet on a later Python 2.7. @eriol

3286 fixes the problem for me.

I note that Fedora's latest python-requests no longer unbundles urllib3.

jayvdb commented 8 years ago

Here is a set of tests which demonstrate some of the problem: https://travis-ci.org/jayvdb/requests/builds/135716555

My apologies that it doesnt show a problem in test_unbundling, which is what I had hoped would show the problem more clearly.

Instead, see the errors in test_verify have:

('HAS_MODERN_SSL', False)
('HAS_PYOPENSSL', True)
('HAS_SNI', True)

and yet they have warnings indicating that those requests are not being processed securely.

eriol commented 8 years ago

@jayvdb thanks for testing but there is something I don't get. Tests for DIST=trusty+sid seem to use Python 2.7.6, but Stretch has Python 2.7.11. Even the stable release, Jessie, has 2.7.9 with several backports (for example cPython >= 2.7.9 has ssl features backported from Python3).

jayvdb commented 8 years ago

Hi @eriol, the DIST=trusty+sid job in that build is mostly Travis' trusty, with selected packages added from sid. Travis trusty reports that it is Python 2.7.6, whereas trusty ships with 2.7.5 as far as I can see, so I am a bit confused about that, but it isnt a job I am particularly focused on, as the same problems appear in the other jobs that have a cleaner virtual env at the beginning of the test sequence.

As a consequence of building the additional testing on Travis, I feel more confident that any distro version shipping Python 2.7.9+ is fine with the current requests.packages unbundling code. I still have some more test scenarios to create, though.

So the only 'problem' may be that the requests 2.10 package in stretch / sid states it Depends: python:any (>= 2.7.5-5~) , which is why it can be installed onto these Travis trusty environments, and probably any other debian derivatives which are still on Python 2.7.5 - 2.7.8 (are there any?).

Arch Linux also uses the unbundling, but it appears to be only providing Python 3.5. Contrary to what I said earlier, Fedora isnt using the unbundling fallbacks in requests.packaging.__init__, but they are still using symlinks, however they are Python 2.7.10+ also, so should be safe anyway.

jayvdb commented 8 years ago

@eriol, I've been able to provide a better test case for this, and it shows that the problem exists even in 2.7.11 and 3.3.

https://travis-ci.org/jayvdb/requests/builds/135867398 is just #3289 with a .travis.yml that shows various combinations all work well with the bundled version of urllib3.

https://travis-ci.org/jayvdb/requests/builds/135870520 is a very simple change to the .travis.yml that emulates what the Debian package looks like, and shows that SubjectAltNameWarning stops occurring on all environments that are using pyopenssl. As explained in the opening issue, under the covers it is more than just the warning that isnt happening. A second copy of the modules are being created and configured for pyopenssl mode, and the actual urllib3 doesnt get into pyopenssl mode.

Finally, https://travis-ci.org/jayvdb/requests/builds/135873388 is #3286 , which fixes the problem. That patch isnt intended to be the final solution; it is a WIP until we figure out what should be merged (I was told in #2670 to PR early), intended mostly to show what does work. I have very quickly looked at the pip approach, and it is doing roughly the same thing so it should work. It does require closer coupling between pip/requests and urllib3, which my patch avoided, for better and for worse. I am not pushing to have my patch, or any other similar patch, merged, pushing requests further down this 'support unbundling' rabbit hole further, if the maintainers don't feel it is appropriate. The patch is there to prove the bug exists.

My next step is going to be to check what happens with symlinks like what Fedora is doing, to see if that is a way to beat the import machinery. (I'd be surprised if it didnt work, but this problem is full of surprises).

jayvdb commented 8 years ago

symlink results are in, with failures up until 2.7.8 only, and the failures are the opposite -- there are too many warnings, rather than too few.

jayvdb commented 8 years ago

I've put together a little playground to show what is happening under the covers, as I found it difficult to do this under pytest, in order to better see the scope of the problem. https://github.com/jayvdb/requests_issue_3287

Worth noting that 2.7.9+ seems to mostly do the right thing, but I havent yet added any demos using site using/abusing Subject Alternative Name.

jayvdb commented 8 years ago

@warsaw , as I saw you're working on a similar package and de-vendoring ( https://github.com/pypa/pip/issues/3802 ), you might like to be aware of, or assist with, this.

warsaw commented 8 years ago

@jayvdb Thanks for the heads-up. That was a quick fix which seemed to work for Debian. I'll take a look at this issue in more detail when I get a chance.

eriol commented 8 years ago

Hi, i did non forgot this. Only busy during this week.

@warsaw my was plan is to use same approach of pip here. What do you think about?

warsaw commented 8 years ago

@eriol +1 I think pip has done the best job of making devendorizing easier for downstream redistributors. My deltas are very small and now that it's been in place for a while, I haven't encountered any issues with it. It's possible I'm missing something, so pinging @dstufft for additional thoughts.

jayvdb commented 8 years ago

Once #3289 is merged, we'll have a unit test to check the de-vendored solution.

eriol commented 8 years ago

@jayvdb many thanks for #3289! I'm going to work on this in a few hours or at max tomorrow. Feel free to ping me again if you did not get a report from me tomorrow afternoon. But I hope to be able to work on this today.

dstufft commented 8 years ago

We've gone through many iterations of the debundling with pip. I think we're pretty happy with the current incantation which has the following properties:

The downsides that we've discovered so far:

jayvdb commented 8 years ago

@dstuff, I very quickly tested pip, and it inherits this problem from requests. pip de-vendoring may well be working wonderfully, but the de-vendored requests isnt initalising urlllib3's pyopenssl properly. See https://travis-ci.org/jayvdb/requests_issue_3287/builds/138597348 , which shows only additional warnings being emitted for pip's requests, but please see the rest of this issue to see why those warnings are occurring and the impact of that. Due to pip mostly being used to talk to one set of servers, the fallout is smaller. My guess is that this isnt a security vulnerability , as it will cause connections to fail when they should succeed, rather than connect when it shouldnt. But I have very little experience in this area, and not much time to understand the corner cases in these packages, so I am sure others here will know the real potential for this to be abused or not.

The pip travis testing framework for de-vendored packages is nice, but note that https://github.com/pypa/pip/blob/master/pip/_vendor/vendor.txt isnt including requests[security] , so that isnt accurately representing distros. Not that it matters a great deal, as there doesnt seem to be any pip tests covering problematic sites needing better ssl support (not surprising; pip expects requests to do that). And, fyi, the pip tests suite are failing atm.

eriol commented 8 years ago

@jayvdb fast report, working on unbundling stuff right now. Testing with your https://github.com/jayvdb/requests_issue_3287.

eriol commented 8 years ago

@dstufft many thanks for the detailed explanation!

I have just uploaded requests 2.10.0-2 to unstable (still in upload queue, it'll show up shortly). I cherry picked the unbundling stuff from pip, this is the patch landed on Debian.

@jayvdb can you test again when requests 2.10.0-2 will be in the archive? Thanks! Also, what about renaming this bug in a more specific way? Does not work seems to broad to me: we are addressing a problem using Python2 and related to SSL.

One more thing, I can bump the Python dependency to ensure a cPython2 with SSL feature from Python3, but only after we fix this. I was not aware of trusty+sid combo, I can understand the use, but mixing packages from different release seems dangerous to me.

jayvdb commented 8 years ago

I have limited capacity to test atm, in a bit of a rush to airport, but should be good in ~6 hrs.

Regarding issue title, this problem also occurs only py3. On py34 it doesnt have the same impact, as the native ssl is better, but the pyopenssl monkey patching is still not working properly. See here on py34 https://travis-ci.org/jayvdb/requests_issue_3287/jobs/136099126#L390

jayvdb commented 8 years ago

Also there are other packages that requests vendors the same way, so obscure problems can probably be found there also. It is the monkey patching that is wrong, not an ssl issue. It is just the ssl side effect is the most concerning of course.

jayvdb commented 8 years ago

I've done some basic testing. The warnings are fixed. It appears that Debian's 2.10.0-2 requests.packages.urllib3 is now a functioning monkey patched copy of urllib3.

However requests.packages.urllib3 is not urllib3! Any code which relies on Debian's requests enabling the top level urllib3 package pyopenssl mode , or expects requests' urllib3 to have the same state as the top level urllib3 package, will still have that expectation broken. Im not sure if this is a problem, but I note that requests/packages/__init__.py stresses that `requests.packages.urllib3 is urllib3 was an objective (and that it had achieved that goal, which is true, but not terribly useful when many of the modules in the two namespaces had different copies with different state ..).

You can see Debian's requests 2.10.0-2 has different state to urllib3 at https://travis-ci.org/jayvdb/requests_issue_3287/builds/138669983#L289 , but that is because requests on Debian now copies urllib3 and doesnt enable pyopenssl in the original top level urllib3 package which is left in its default state as my test script doesnt attempt to enable pyopenssl in urllib3.

lilydjwg commented 7 years ago

Hi, distro user here, and today I've met exactly the same issue. Can someone explain to me why requests bundles so many packages? For easy-of-use? Users need only one big package in this way. But pip has done a good job on installing dependencies for these users. So it is for pinning known-compatible versions? Or something else I've totally missed?

sigmavirus24 commented 7 years ago

@lilydjwg pip does a good job on installing dependencies if there are no conflicts. Consider the following case:

There's a version of Requests, let's call it vR.0.0 that requires a specific version of urllib3, let's call that vU.0.0 and anything prior to vU.0.0 will break something in Requests. Let's also say that the user uses the python elasticsearch package and pin urllib3 to vT.0.0 which comes before vU.0.0. If their requirements.txt file looks like this:

requests==R.0.0
elasticsearch-py==E.0.0
urllib3==T.0.0

Then they will have a broken Requests installation. Vendoring the dependencies makes it easy to know that Requests will always work when installed from PyPI (we cannot ever make the same guarantee for a distro user, that guarantee is up to the packagers to make).

Both Requests and urllib3 are widely used and popular packages. Requests will sometimes want to stick to an older version or use a newer version and this allows us complete control over that.

Keep in mind, however, that pip has GSOC student who is working on resolving the above problem. That will be one less issue for Requests to worry about then once that lands. Unfortunately, people don't often upgrade pip. Namely, the people who rarely update pip are users such as yourself who rely on their distro to provide them python packages. In your case, you will likely run into this issue unless you're using a alpha/beta version of your distro or something like Arch Linux which does it's best to stay current.

lilydjwg commented 7 years ago

@sigmavirus24 thanks for the explanation!

I'm on Arch Linux. However companies tend to use stable distros with terribly-old packages, so I install pip and upgrade it to latest every time :-)

sigmavirus24 commented 7 years ago

Closing since Requests unvendored everything recently.

sakshi094 commented 7 years ago

data['created_by'] = request.headers.get(header.GRASS_HEADER_USER_ID, '') how to mock it using monkeypatch any idea ?

sigmavirus24 commented 7 years ago

@sakshi094 that's not what this issue is about. I suggest you ask your question on StackOverflow.