divergentdave closed 8 years ago
Looks like I introduced an error into the commerce scraper, so don't merge this yet.
This PR doesn't contain any edits to inspectors/commerce.py -- are you sure it needs to be held up?
Heh, good point. Must be caused by a change in the site then.
@konklone @divergentdave Thanks, I just read up on some of the background on this; it was news to me. It seems that while upgrading the root to SHA-2 is recommended, stuff won't start breaking until 1/17? Or are you saying functionality is being impacted now and we should look at upgrading earlier?
OK, so yeah -- it looks like www.house.gov returns a root certificate in the chain, and the root happens to be SHA-1. Including any root certificate in the chain is superfluous, since the root cert should be referenced from a local root store rather than the chain itself.
SSL Labs shows the Baltimore CyberTrust Root SHA-1 root included in the chain. It flags this with a warning of "Extra certs", too:
https://www.ssllabs.com/ssltest/analyze.html?d=house.gov&s=23.193.9.49&latest
However, this is fine for browsers, because SHA-1 deprecation in browsers doesn't impact the validation of root certificates. Root certificates are not validated using their signature. This is acknowledged in Google's original SHA-1 deprecation blog post, and on the SSL Labs report ("Weak or insecure signature, but no impact on root certificate").
So @joelcollinsdc, I don't think there's any SHA-1 deprecation issue you need to worry about. However, you do have an extraneous cert in the chain.
@divergentdave Is there anything www.house.gov could do to fix the error...? It looks like its root just isn't trusted in the certifi bundle?
Okay, I actually looked at the chain the server is sending, and the last certificate is not self-signed; rather, it is a cross-signed certificate for the Baltimore CyberTrust Root CA, signed by GTE CyberTrust Global Root. While SSL Labs calls it an "extra certificate," it could totally be useful with older clients. The discussion on this Chromium issue suggests that this cross-signed cert is required for Froyo-vintage Android. I think leaving it as-is would be the best course of action.
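For anyone following along, the self-signed versus cross-signed distinction can be sketched with a toy model. This is only an illustration: the `Cert` tuple and `describe_last_cert` helper are hypothetical names, the intermediate CA name is made up, and comparing subject and issuer names is a heuristic that does not verify any signature.

```python
from typing import NamedTuple, Sequence


class Cert(NamedTuple):
    """Minimal stand-in for a certificate: only the two names matter here."""
    subject: str
    issuer: str


def describe_last_cert(chain: Sequence[Cert]) -> str:
    """Classify the last certificate of a server-supplied chain (leaf first)."""
    last = chain[-1]
    if last.subject == last.issuer:
        # Subject and issuer match: this looks like a self-signed root.
        return "self-signed"
    # Subject and issuer differ: the cert was issued by another CA,
    # which is what a cross-signed "root" looks like.
    return "issued by " + last.issuer


# Shape of the chain described above; the intermediate name is hypothetical.
served_chain = [
    Cert("www.house.gov", "Example Intermediate CA"),
    Cert("Example Intermediate CA", "Baltimore CyberTrust Root"),
    Cert("Baltimore CyberTrust Root", "GTE CyberTrust Global Root"),
]

print(describe_last_cert(served_chain))
```

A true root would have the same name in both fields, which is not what this server sends.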
From the certifi point of view, Baltimore CyberTrust Root appears in both certifi.where() and certifi.old_where(), whereas GTE CyberTrust Global Root appears only in certifi.old_where(), since it is being phased out. If we were using OpenSSL 1.0.2, everything would just work with the more secure set of roots. However, since we're using OpenSSL 1.0.1, it follows the chain all the way down, ignores the fact that Baltimore CyberTrust Root is in its trusted store, and only checks whether GTE CyberTrust Global Root is in the trusted store. Thus, we have to use certifi.old_where() in this case.
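The difference between the two behaviours can be sketched as a toy model. This is not OpenSSL's actual chain-building algorithm; `Cert`, `chain_is_trusted`, and the `check_store_first` flag are hypothetical names, the intermediate CA name is made up, and the mapping of the flag to OpenSSL 1.0.1 and 1.0.2 simply follows the description above.

```python
from typing import NamedTuple, Sequence, Set


class Cert(NamedTuple):
    """Minimal stand-in for a certificate: subject and issuer names only."""
    subject: str
    issuer: str


def chain_is_trusted(chain: Sequence[Cert], trusted_roots: Set[str],
                     check_store_first: bool) -> bool:
    """Toy trust decision for a server-supplied chain (leaf first)."""
    if check_store_first:
        # Newer behaviour as described above: succeed as soon as any
        # certificate in the chain was issued by a trusted root.
        return any(cert.issuer in trusted_roots for cert in chain)
    # Older behaviour as described above: follow the supplied chain to its
    # end and only look up the issuer of the last certificate.
    return chain[-1].issuer in trusted_roots


# Shape of the chain discussed above; the intermediate name is hypothetical.
CHAIN = [
    Cert("www.house.gov", "Example Intermediate CA"),
    Cert("Example Intermediate CA", "Baltimore CyberTrust Root"),
    Cert("Baltimore CyberTrust Root", "GTE CyberTrust Global Root"),
]

# Stand-ins for the relevant entries of certifi.where() and certifi.old_where().
NEW_ROOTS = {"Baltimore CyberTrust Root"}
OLD_ROOTS = {"Baltimore CyberTrust Root", "GTE CyberTrust Global Root"}

print(chain_is_trusted(CHAIN, NEW_ROOTS, check_store_first=False))
print(chain_is_trusted(CHAIN, OLD_ROOTS, check_store_first=False))
print(chain_is_trusted(CHAIN, NEW_ROOTS, check_store_first=True))
```

In this model, only the old root set satisfies the older behaviour, which matches the conclusion above. With requests, a CA bundle path can be passed through the `verify` argument, for example `verify=certifi.old_where()` on certifi releases that still provide that function.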
Followup to #248, here's a big batch of HTTPS upgrades. In the coming days, I'll check remaining HTTP scrapers again to see if there are any more that can be upgraded.