encode / httpx

A next generation HTTP client for Python. 🦋
https://www.python-httpx.org/
BSD 3-Clause "New" or "Revised" License
13.37k stars 854 forks source link

Drop certifi, use system trust store by default #302

Open sethmlarson opened 5 years ago

sethmlarson commented 5 years ago

This is an issue that many people have been trying to tackle for Python but hasn't been done yet. I think we're in a pretty good position to try to tackle this in a way that can be available to everyone.

This should probably be implemented as a separate library, maybe use a stripped-down oscrypto project as a starting point.

Reasons to do this:

yeraydiazdiaz commented 4 years ago

It's been a while since this was raised has been brought up recently a couple of times, could we go into detail of what's needed here?

For reference oscrypto lists other related libraries, would any of those fit our needs? (I'd look myself but I just don't know enough about the subject 🙂 )

florimondmanca commented 4 years ago

Hey!

So considering the amount of :+1:'s on this issue I thought it would make sense to take some kind of decision on this as part of 1.0 - https://github.com/encode/httpx/issues/947#issuecomment-629698226

I'm personally on the side of dropping certifi as well, mainly because 1/ it's an extra dependency and 2/ it's probably better for certifi to be opt-in (e.g. we can hint it in the Requests migration guide) rather than opt-out. Other reasons mentioned by Seth in the issue description also apply.

I'm pretty much a n00b in terms of TLS implementations, let alone system compatibility, so let me ask what I feel might be a n00b question but maybe not after all…

What if we just used ssl.create_default_context()?

From the docs (emphasis mine):

Return a new SSLContext object with default settings for the given purpose. The settings are chosen by the ssl module, and usually represent a higher security level than when calling the SSLContext constructor directly. (So it's supposed to be secure enough choice, right?)

cafile, capath, cadata represent optional CA certificates to trust for certificate verification, as in SSLContext.load_verify_locations(). If all three are None, this function can choose to trust the system’s default CA certificates instead.

TL;DR: it looks to me ssl.create_default_context() should* create an ssl_context that uses the system defaults. Is there anything else we'd need? Or would we not be considering this because it's not the always most secure choice everywhere?

(The usage of "can*" in the docs looks suspicious to me. Are there cases when Python wouldn't trust the default system CA? And what happens if it doesn't? Would the default ssl_context then be a "no verify" context?)

I'm wondering if this wouldn't be an okay default, mainly because two notable projects use and/or recommend the use of ssl.create_default_context() for configuring TLS:

tomchristie commented 4 years ago

Not my core expertise, but pretty sure it's not a solved issue.

I sat down with @lukasa last year briefly, and one of the technical issues we talked about was dropping certifi, and the complexities of accessing the system trust store on different O/S's.

I'm sure @sethmlarson has a much better handle on the current state of affairs here. I think either @tiran's or @glyph's name might also have come up in my conversation with Cory, wrt. folks having done some work in this area, but I might be getting that wrong. Kinda a PyCon type thing that would've benefited from being able to chat it out with relevant folks, but...

In any case, assuming I have got the landscape correct here, I think "use system trust store by default" sits firmly in "make this work in an independent, tightly-scoped third party package", rather than specifically "solve this in httpx".

If and when such a package exists, then yup we can take a look at using it, otherwise it's in the realm of "this isn't a resolved issue in the Python ecosystem, certifi is the best we can do until then".

florimondmanca commented 4 years ago

Just found out about this discussion starting back from 2016 in the Requests repo, loads of good background there: https://github.com/psf/requests/issues/2966

Edit, after reading it:

it's in the realm of "this isn't a resolved issue in the Python ecosystem, certifi is the best we can do until then".

Agreed.

(Also I think I found an answer there to "Wouldn't create_default_context() be enough?", which is "yes, but only on *nix, and only if OpenSSL is linked correctly". We already support passing a preconfigured ssl_context as verify, so we probably cover enough ground there already.)

tiran commented 4 years ago

I need to finish my prototype...

florimondmanca commented 4 years ago

Another data point in favor of us keeping certifi at this time: aiohttp not bundling certs by default seems to have been causing users a lot of pain… https://github.com/aio-libs/aiohttp/issues?q=is%3Aissue+verify+failed+

In a lot of these issues the solutions are often a mix of "disable cert validation" or "use certifi".

Eg https://github.com/aio-libs/aiohttp/issues/955

tiran commented 4 years ago

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=960869 should help to solve a bunch of problems. Fedora / RHEL should have root CA certificates installed. I'll verify UBI mini tomorrow.

ofek commented 4 years ago

@tiran How far along is that prototype? 😄

AstraLuma commented 3 years ago

A short-term solution might be to support pulling a CA bundle from environment variables (requests also supports this). This gives a knob for system administrators to make httpx behave.

On the prior art pile, urllib3 supports SecureTransport, but not SChannel.

EDIT: Oh, they stole code from https://github.com/wbond/oscrypto which looks a lot more complete

Since this hasn't been mentioned in the thread, TLS has several knobs with regards to security, the big ones being algorithm selection and server certificate policies. Per the docs:

The settings are: PROTOCOL_TLS, OP_NO_SSLv2, and OP_NO_SSLv3 with high encryption cipher suites without RC4 and without unauthenticated cipher suites. Passing SERVER_AUTH as purpose sets verify_mode to CERT_REQUIRED and either loads CA certificates (when at least one of cafile, capath or cadata is given) or uses SSLContext.load_default_certs() to load default CA certificates.

This is basically drops the horribly insecure options and presents an ok default. (I am not enough of a security expert to say how good these defaults are.) In general, OpenSSL is complex, probably more complex than necessary.

AstraLuma commented 3 years ago

Also, a moment of silence for PEP 543.

stale[bot] commented 2 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

ofek commented 2 years ago

Bump.

stale[bot] commented 2 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

tomchristie commented 2 years ago

Linking to Seth's work on this... https://github.com/sethmlarson/truststore

dgasmith commented 2 years ago

We're hitting this issue as well when using ZTNA, for requests we are able to set the REQUESTS_CA_BUNDLE environment variable (source). Happy to make a PR here as well if an environment level is a workable solution.

We can implement the cert logic at the application level; however, this will lead to inconsistent behavior as the use of HTTPX grows.

tomchristie commented 2 years ago

for requests we are able to set the REQUESTS_CA_BUNDLE environment variable (source). Happy to make a PR here as well if an environment level is a workable solution.

Are SSL_CERT_FILE/SSL_CERT_DIR what you're looking for?

https://www.python-httpx.org/environment_variables/#ssl_cert_file

We're hitting this issue as well

Can you be more specific?

tomchristie commented 1 year ago

Using truststore with httpx...

import httpx
import ssl
import truststore

ssl_context = truststore.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
client = httpx.Client(verify=ssl_context)
ytjohn commented 1 year ago

I am new to httpx and ran into this problem as well.

https://www.python-httpx.org/environment_variables/#ssl_cert_file

EDIT: I misread it the first time. Setting SSL_CERT_FILE environment like I do REQUESTS_CA_CERT works.

zanieb commented 1 year ago

@ytjohn as noted by Tom, SSL_CERT_FILE is a standard used by OpenSSL and other tools like Golang — does that not do what you want?

ytjohn commented 1 year ago

@madkinsz my bad, I must have developed a reading comprehension failure. Yes, SSL_CERT_FILE solves the problem.

tomchristie commented 1 year ago

I'd suggest that we add this example to our documentation here either verbatim or as a gist, and then close this issue.


Related... I believe that our ssl context API would be better as httpx.get(..., ssl_context=...) than our currently verify=.../cert=... API... see https://github.com/encode/httpx/discussions/2521

AstraLuma commented 1 year ago

Please make it clear to admins (not just developers) how to set this. The problem is that most developers only care about this in the "please forward this option" sense, and laziness/ignorance means they won't.

It is dependent on the environment the end software is used in what is the reasonable default. But as someone that's been the enterprise sysadmin that's thrice-cursed certifi's lack of overrides, the use of truststore would have made my job much easier.

gpongelli commented 11 months ago

Using truststore with httpx...

import httpx
import ssl
import truststore

ssl_context = truststore.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
client = httpx.Client(verify=ssl_context)

you saved my day! please add this example into documentation 👍

wimvanleuven commented 10 months ago

Problem with the proposed solution is that this only work for direct uses of httpx. If using a library wrapping httpx, we have no means of injecting the truststore or SSL context, which is an added value of using env vars like REQUESTS_CA_BUNDLE, SSL_CERT_FILE or installing pip-system-certs on windows.

HTH

mehrdad-shokri commented 4 months ago

I'm using a package which uses httpx as it's http client. there is no constructor arguments to pass client or the httpx verify arg to it. basically stuck here.

tomchristie commented 1 month ago

So @sethmlarson's truststore looks like it's mature now. Worth considering the switch, perhaps?

sethmlarson commented 1 month ago

Truststore is used by pip and PDM, so should be good to adopt if you are still interested. Still requires Python 3.10 or later, though. We also already have integration tests for HTTPX specifically: https://github.com/sethmlarson/truststore/blob/main/tests/test_inject.py#L116

tomchristie commented 1 month ago

Yup, very much interested.

Still requires Python 3.10 or later, though.

Okay, that's a consideration for us, tho we could potentially have a fallback to certifi for 3.8 and 3.9.

Truststore has also been integrated into pip 24.2+ as the default method for verifying HTTPS certificates

I knew truststore was an option in pip, I didn't realise it was now on-by-default. Impressive.

ofek commented 1 month ago

I would love if it became the default on 3.10+ for v1.0! It's what I plan to do for Hatch anyway so it would save me time and it's what is desirable everywhere I've ever worked.

ofek commented 1 month ago

It's worth noting that the Rust community is also going in the direction of making system certificate verification the default with this new package that everyone is transitioning to https://github.com/rustls/rustls-platform-verifier

zanieb commented 1 month ago

It's worth noting that the Rust community is also going in the direction of making system certificate verification the default...

Hm interesting. uv does not use system certificates by default (and we don't really get complaints about it).

ofek commented 1 month ago

It looks like you do https://github.com/astral-sh/uv/pull/1512

Edit: ah by default you mean but is supported. I assume by not getting complaints is because you have support for it via an option