EFForg / https-everywhere

A browser extension that encrypts your communications with many websites that offer HTTPS but still allow unencrypted connections.
https://eff.org/https-everywhere

Implicit testing of rules #2523

Closed KOLANICH closed 8 years ago

KOLANICH commented 8 years ago

I think that we shouldn't have to provide tests for rules. Instead, the test script should discover all the subdomains of a domain, test the ruleset against each of them, and check whether a redirect loop or an error code occurs; if either does, treat that as a test failure.
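Roughly, the check for each discovered host could look like the sketch below (Python for illustration; `apply_ruleset` stands in for whatever performs the ruleset rewrite and is not an existing helper):

```python
# A minimal sketch of the proposed implicit check, assuming a hypothetical
# apply_ruleset(url) callable that applies the ruleset's http->https rewrite.
from urllib.parse import urljoin

import requests


def check_host(host, apply_ruleset, max_redirects=10):
    """Rewrite http://<host>/ and report a redirect loop or an error status."""
    url = apply_ruleset("http://%s/" % host)
    seen = {url}
    try:
        for _ in range(max_redirects):
            # Follow redirects manually so loops can be detected.
            resp = requests.get(url, allow_redirects=False, timeout=10)
            if resp.status_code >= 400:
                return "error %d on %s" % (resp.status_code, url)
            if resp.is_redirect:
                url = urljoin(url, resp.headers["Location"])
                if url in seen:
                    return "redirect loop at %s" % url
                seen.add(url)
                continue
            return None  # reachable without a loop or error
        return "too many redirects starting from %s" % url
    except requests.RequestException as exc:
        return "fetch failed: %s" % exc


# failures = [m for m in (check_host(h, rewrite) for h in subdomains) if m]
```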

reedy commented 8 years ago

How should it discover all subdomains of a domain? From those listed in the ruleset? What happens when it's listed as *.bar.com? Or via some other enumeration?

What about cases where https://foo.bar.com won't work for whatever reason, but https://foo.bar.com/baz/script.php does?

jsha commented 8 years ago

@KOLANICH: As @reedy says, certain types of rules (especially wildcards) make it very difficult to automatically discover the set of URLs to be fetched.

Additionally, fetching only the root of a domain gives an inaccurate picture of the site overall. Eventually I intend to add mixed content and other checking to https-everywhere-checker, at which point it will be advisable to add test URLs for various types of representative pages on a site, to find out if the site regresses from full HTTPS support.
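For illustration, collecting such test URLs from the rulesets could look roughly like the sketch below, assuming they are recorded as `<test url="..."/>` elements and that the rules live under the usual rules directory (both assumptions to verify against the checker's actual behaviour):

```python
# A sketch of collecting per-ruleset test URLs, assuming they are stored as
# <test url="..."/> elements inside each ruleset file; the rules directory
# path is also an assumption.
import glob
import xml.etree.ElementTree as ET


def collect_test_urls(ruleset_dir="src/chrome/content/rules"):
    tests = {}
    for path in glob.glob("%s/*.xml" % ruleset_dir):
        root = ET.parse(path).getroot()
        urls = [t.get("url") for t in root.findall("test")]
        if urls:
            tests[root.get("name")] = urls
    return tests
```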

KOLANICH commented 8 years ago

> How should it discover all subdomains of a domain?

https://stackoverflow.com/questions/131989/how-do-i-get-a-list-of-all-subdomains-of-a-domain
https://security.stackexchange.com/questions/35078/how-can-i-find-subdomains-of-a-site
https://webmasters.stackexchange.com/questions/23786/is-it-possible-to-find-all-subdomains-for-a-certain-domain
http://www.shellhacks.com/en/HowTo-Get-a-List-of-All-Sub-Domains-of-a-Domain-Name

According to these, there are at least three ways:

1. AXFR and other means of zone replication
2. dictionary bruteforce
3. the Google hack
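For example, the dictionary-bruteforce option boils down to something like the following sketch (the wordlist file name is a placeholder; real wordlists run to many thousands of entries):

```python
# A small sketch of option 2 (dictionary bruteforce): resolve candidate
# names from a wordlist and keep those that exist.
import socket


def bruteforce_subdomains(domain, wordlist="subdomains.txt"):
    found = []
    with open(wordlist) as fh:
        for word in (line.strip() for line in fh):
            if not word:
                continue
            host = "%s.%s" % (word, domain)
            try:
                socket.getaddrinfo(host, 443)
            except socket.gaierror:
                continue  # name does not resolve
            found.append(host)
    return found
```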

> From those listed in the ruleset? What happens when it's listed as *.bar.com? Or via some other enumeration?

1. discover all subdomains
2. filter them (see the sketch below)
3. test the rules on them
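A simplified sketch of the filtering step, with the caveat that the wildcard handling here (treating *.bar.com as "anything ending in .bar.com") is an assumption and may not match the extension's exact target semantics:

```python
# Keep only the discovered hosts that a ruleset's <target> entries cover.
def matches_target(host, target):
    if target.startswith("*."):
        return host.endswith(target[1:])  # ".bar.com" suffix
    return host == target


def filter_hosts(discovered, targets):
    return [h for h in discovered if any(matches_target(h, t) for t in targets)]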

reedy commented 8 years ago

I know how you technically can do it; I just don't see that it's worthwhile. We have over 16,000 rule files, and I know many of them cover more than one domain... It's going to be a lot of DNS lookups, and AXFR etc. isn't enabled by default.

jsha commented 8 years ago

@KOLANICH: As @reedy said, most domains don't enable AXFR. That leaves dictionary bruteforce (automatable, but expensive) and the Google hack.

The Google hack is a nice one I hadn't seen before; I had been using Wolfram Alpha to look up subdomains. I'll try [site:*.example.com] next time. Still, it is manual, unless we integrate with Google's search API (not sure of its status). Even if we integrate with the API there will be a rate limit, which means running the search occasionally and embedding the results in rulesets as test URLs. If you'd like to write this integration to help increase test coverage (or to make it easier to write new rulesets that include tests), I would be quite happy with that! But we still need support for test URLs in rulesets.
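An occasional run against Google's Custom Search JSON API could look roughly like the sketch below; the api_key/cx values are placeholders, and whether that API honours a site:*.example.com query is something to verify:

```python
# A hedged sketch of harvesting subdomains from search results.
from urllib.parse import urlparse

import requests


def search_subdomains(domain, api_key, cx, pages=3):
    hosts = set()
    for start in range(1, pages * 10, 10):
        resp = requests.get(
            "https://www.googleapis.com/customsearch/v1",
            params={"key": api_key, "cx": cx,
                    "q": "site:*.%s" % domain, "start": start},
            timeout=10,
        )
        resp.raise_for_status()
        for item in resp.json().get("items", []):
            hosts.add(urlparse(item["link"]).hostname)
    return sorted(hosts)
```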

KOLANICH commented 8 years ago

> Even if we integrate with the API there will be a rate limit, which means running the search occasionally and embedding the results in rulesets as test URLs.

This is a solution. Can we use it with Travis CI (maybe this would need a separate service on eff.org that serves a cached list of subdomains)? We could probably also use some third-party online services for this (some of them have no captcha, but the results they give are poor).

Moreover, using the Google hack it is possible to generate rules automatically (on request) from the results of inurl:https:// site:example.com and inurl:http:// site:example.com. Maybe it would even be worthwhile to include this functionality in the addon: when accessing a site for which no rule is present, the addon could, with some small probability, trigger an EFF service that auto-creates a rule for that domain.
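As a rough illustration, auto-creating a rule from hosts the Google hack found answering over https:// could be as simple as the sketch below; the output follows the usual ruleset layout, and anything generated this way would still need review before shipping:

```python
# Generate a minimal ruleset covering the given HTTPS-capable hosts.
def make_ruleset(name, https_hosts):
    lines = ['<ruleset name="%s">' % name]
    for host in sorted(https_hosts):
        lines.append('\t<target host="%s" />' % host)
    lines.append('\t<rule from="^http:" to="https:" />')
    lines.append('</ruleset>')
    return "\n".join(lines)


# print(make_ruleset("Example.com", ["example.com", "www.example.com"]))
```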