balvin-perrie / http-to-https

a lightweight automatic HTTPS conversion

Doesn't handle 3rd-party sites #1

Open ghost opened 5 years ago

ghost commented 5 years ago

HTTP to HTTPS 0.1.0 has the same issue as a similar extension, Smart HTTPS: neither handles 3rd-party sites.

A site that doesn't support https may nevertheless make insecure calls to one or several 3rd-party sites that do support https. HTTP to HTTPS doesn't handle those calls.

Example: http://www.cartesfrance.fr/

The site is http only but calls maxcdn.bootstrapcdn.com via http. If I use the HTTPS Everywhere extension, maxcdn.bootstrapcdn.com is called via https. If I use HTTP to HTTPS (or Smart HTTPS), maxcdn.bootstrapcdn.com is called via http.

Moreover, imagine the number of 3rd-party sites called when a browser is not protected from 3rd-party connections for lack of an extension such as uBlock Origin: some http-only sites may call an amazing number of 3rd-party sites via http when many of those support https ...

My belief is that any extension aiming to ease the use of https absolutely must consider 3rd-party connections.

balvin-perrie commented 5 years ago

True, in its current state this extension only observes top-level pages. If we aim to evaluate all web requests, the only approach would be to keep a list of all supported hostnames and redirect automatically, as HTTPS Everywhere does. Note that the list needed for this purpose is really huge (we are talking about a 100MB-ish regexp mapping object), which impacts browser performance. That's why I decided to write a new lightweight alternative.
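To illustrate the list-based approach, here is a minimal sketch that assumes a plain set of hostnames rather than the regexp rules HTTPS Everywhere actually ships. The names `httpsCapableHosts` and `upgradeIfKnown` are hypothetical and are not part of this extension or of HTTPS Everywhere.

```typescript
// Hypothetical sample list; a real ruleset would contain vastly more entries,
// which is where the memory and lookup cost comes from.
const httpsCapableHosts: Set<string> = new Set([
  "maxcdn.bootstrapcdn.com",
]);

// Returns the https version of rawUrl when its host is on the list,
// or null when the request should be left untouched.
function upgradeIfKnown(rawUrl: string, hosts: Set<string>): string | null {
  const url = new URL(rawUrl);
  if (url.protocol !== "http:" || !hosts.has(url.hostname)) {
    return null;
  }
  url.protocol = "https:";
  return url.toString();
}
```

With this sample list, a request to maxcdn.bootstrapcdn.com would be upgraded, while a request to www.cartesfrance.fr would return `null` and stay on http. A set lookup like this is cheap per request; the cost mentioned above comes from the size of the list that has to be shipped and kept in memory.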

I am open to new ideas.

sergeevabc commented 5 years ago

@balvin-perrie, what about a loop like `for request in http_requests do try_https else try_http`?

balvin-perrie commented 5 years ago

@sergeevabc

  1. Web request handling is a synchronous process. You need to decide whether to redirect before the actual request is processed.
  2. Browsers process a huge number of web requests. Duplicating each one would harm performance even more than the regexp matching.
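The first point can be sketched as follows, assuming Chrome's Manifest V2 style blocking `webRequest` listener. This is not the extension's actual code. The handler name `onBeforeRequest` and the two interfaces are hypothetical, simplified stand-ins for the browser's own types.

```typescript
// Simplified, hypothetical shapes of what the browser passes in and expects back.
interface RequestDetails {
  url: string;
  type: string; // e.g. "main_frame", "stylesheet", "script"
}
interface BlockingResponse {
  redirectUrl?: string;
}

// The listener has to return its decision immediately. There is no room to
// first probe the https version of the URL and fall back to http on failure.
function onBeforeRequest(details: RequestDetails): BlockingResponse {
  if (details.type === "main_frame" && details.url.startsWith("http://")) {
    return { redirectUrl: "https://" + details.url.slice("http://".length) };
  }
  return {}; // sub-resources (3rd-party calls) pass through unchanged
}

// Registration only happens inside an extension context where the API exists.
const chromeApi = (globalThis as any).chrome;
if (chromeApi && chromeApi.webRequest) {
  chromeApi.webRequest.onBeforeRequest.addListener(
    onBeforeRequest,
    { urls: ["http://*/*"] },
    ["blocking"],
  );
}
```

Because the return value is all the browser waits for, a "try https, else try http" probe would have to be a separate request made beforehand, which is the duplication the second point warns about.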