EFForg / https-everywhere

A browser extension that encrypts your communications with many websites that offer HTTPS but still allow unencrypted connections.
https://eff.org/https-everywhere

Various issues in Python package metadata URLs #18867

Closed jayvdb closed 1 year ago

jayvdb commented 4 years ago

I've crawled a significant subset of Python-related packages and their metadata, and checked the HTTPS status of their websites. I am interested in which of these problems are suitable for fixing with https-everywhere. Higher-than-average technical skills can be assumed, as these domains are or were used by software producers, and I intend to inform some of the developers. However, many are old, unmaintained packages that are still heavily used, where I doubt any fix would be prompt, so I would appreciate opinions on how long to wait for upstream fixes before adding https-everywhere rules. I don't see much advice in the docs about the process for additions or modifications. At what point do users' needs outweigh trying to get upstream to fix the problem? For example, one of the items below is packages.python.org, which is quite a prominent site; I've reported it upstream but have not seen any response yet (I will try to get it fast-tracked by reporting it closer to the admins).

parked domains with https problems

https very slow while http ok

nothing on https (I guess these are out of scope for https everywhere?)

wrong cert

Github Pages

readthedocs.io

Self signed

Expired

strict cert verify failure - ok in Firefox, not ok with python requests secure mode
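
For readers who want to reproduce this kind of bucketing, here is a minimal sketch of how a crawler might sort certificate failures into the categories above. It uses the standard-library `ssl` module rather than python requests, and the mapping from OpenSSL verify codes to category names is an assumption of this sketch, not the logic used in the original crawl. The function names `classify_verify_code` and `probe_https` are hypothetical.

```python
import socket
import ssl

# Assumed mapping from OpenSSL X509 verify codes to the categories above.
# 20 and 21 often indicate a missing intermediate certificate, which some
# browsers tolerate while strict clients do not.
VERIFY_CODE_CATEGORIES = {
    10: "expired",                     # certificate has expired
    18: "self signed",                 # self-signed leaf certificate
    19: "self signed",                 # self-signed certificate in chain
    20: "strict cert verify failure",  # unable to get local issuer certificate
    21: "strict cert verify failure",  # unable to verify the first certificate
    62: "wrong cert",                  # hostname mismatch
}


def classify_verify_code(code):
    """Return a category name for an OpenSSL verify code."""
    return VERIFY_CODE_CATEGORIES.get(code, "other verify failure")


def probe_https(host, port=443, timeout=10.0):
    """Attempt a verified TLS handshake and report a category string."""
    context = ssl.create_default_context()
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=host):
                return "ok"
    except ssl.SSLCertVerificationError as exc:
        # verify_code carries the numeric OpenSSL verification error.
        return classify_verify_code(exc.verify_code)
    except OSError:
        # Refused connections, timeouts, and other TLS errors land here.
        return "no usable https"
```

Calling `probe_https("example.org")` (a placeholder host) performs a real network connection, so results depend on the local trust store as well as on the server.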

jayvdb commented 4 years ago

A bit of analysis indicates https-everywhere doesn't like explicitly downgrading from https to http, even when necessary. I am drawing that conclusion mostly from the fact that there are zero enabled rules which match from="^https:... as far as I can see; the following are all disabled rules or false positives, I believe.

> git grep -n '\(from="^https\|to="http:\)'
Cato-Institute.xml:44:  <!-- <rule from="^https://www\.cato\.org/([^/]+/?(?:[^/]+/?)?)?$"
Cato-Institute.xml:45:          to="http://www.cato.org/$1" downgrade="1" /> -->
Epson.xml:71:   <rule from="^https://(?:www\.)?epson\.com/((?:[a-zA-Z][a-zA-Z\d]+){1})$"
Fasthosts.xml:34:               <rule from="^https://www\.fasthosts\.co\.uk/js/track\.js"
Tesco.xml:177:  <!--rule from="^https://secure\.tesco\.com/"
Tesco.xml:178:          to="http://www.tesco.com/" downgrade="1" /-->
Yandex.xml:641: <!--rule from="^https://static-maps\.yandex\.ru/"
Yandex.xml:642:         to="http://static-maps.yandex.ru/" downgrade="1" /-->
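
The grep above also matches rules that sit inside XML comments, which is why its hits need manual review. As a rough alternative, the following sketch parses a ruleset with the standard-library XML parser, which skips comments, and reports only active rules that go from https to http. The function name `find_downgrade_rules` and the sample ruleset for example.org are made up for illustration; only the `rule`, `from`, `to`, and `downgrade` names come from the output shown above.

```python
import xml.etree.ElementTree as ET


def find_downgrade_rules(ruleset_xml):
    """Return (from, to) pairs for active rules that map https to http."""
    root = ET.fromstring(ruleset_xml)
    hits = []
    # Commented-out rules are not part of the parsed tree, so they are skipped.
    for rule in root.iter("rule"):
        src = rule.get("from", "")
        dst = rule.get("to", "")
        if src.startswith("^https") and dst.startswith("http:"):
            hits.append((src, dst))
    return hits


# Hypothetical ruleset: one disabled downgrade rule, one active upgrade rule.
SAMPLE = r"""<ruleset name="Example">
  <target host="example.org" />
  <!-- <rule from="^https://www\.example\.org/" to="http://www.example.org/" downgrade="1" /> -->
  <rule from="^http:" to="https:" />
</ruleset>"""

print(find_downgrade_rules(SAMPLE))  # prints []
```

Run over every ruleset file, a check like this would separate the false positives (such as a `from="^https` rule whose target is still https) from genuine downgrades.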

There are 3518/24931 rules with exclusions, so there is a decent attempt at defining http-only resources, but that seems to occur mostly for parts of a domain that otherwise has http->https rules.

> git grep 'exclusion pattern="^http:' | wc -l
3518
> ls | wc -l
24931

Am I deriving too much policy/process guidance from the existing dataset?

pipboy96 commented 4 years ago

@jayvdb We try never to touch HTTPS requests if at all possible, and explicitly disallowed downgrading HTTPS to HTTP quite a long time ago.

jayvdb commented 4 years ago

Ok, good to know that downgrading rules are not permitted - not surprising. That eliminates one set of policy questions. In my next run of the job, I'll pay more attention to where my logic achieves the http->https transition successfully, and try to get them into the rulesets where missing.

Many of the items I listed above can be fixed by replacing http/https at a custom domain with https at github.io, rtd.io, or another custom domain, and I see this happening frequently in existing rules. So there is still the policy/process question of how long to wait for 'upstream' to fix their website before adding rules here to bypass the problems.
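
To make the kind of rewrite discussed here concrete, the sketch below emulates applying a single `from`/`to` rule pair to a URL. It is only an approximation of what the extension does: the helper `apply_rule` is hypothetical, the domains docs.example.org and example.github.io are placeholders, and it assumes the rule's pattern is also valid Python regex syntax.

```python
import re


def apply_rule(url, from_pattern, to_template):
    """Apply one from/to rewrite to a URL, returning it unchanged on no match."""
    # Ruleset targets use $1-style group references; Python uses \g<1>.
    py_template = re.sub(r"\$(\d)", r"\\g<\1>", to_template)
    rewritten, count = re.subn(from_pattern, py_template, url, count=1)
    return rewritten if count else url


# Hypothetical rule: a custom docs domain redirected to its GitHub Pages host.
FROM = r"^http://docs\.example\.org/(.*)$"
TO = "https://example.github.io/docs/$1"

print(apply_rule("http://docs.example.org/guide/intro.html", FROM, TO))
# prints https://example.github.io/docs/guide/intro.html
```

A URL that does not match the pattern is returned untouched, which mirrors the expectation that a rule only affects the hosts it targets.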

cjwelborn commented 4 years ago

I closed the issue related to my domain because I was able to fix it. The combination of GoDaddy and GitHub-Pages was a pain for a long time when trying to enable HTTPS. The DNS management interface is minimal, and they expect you to be a DNS expert. Fortunately, there are better tutorials, bug reports, and help topics on the issue now.