benbalter / site-inspector

Ruby Gem to sniff information about a domain's technology and capabilities.
https://site-inspector.herokuapp.com
MIT License

Reconcile 1.x discrepancies #60

Open benbalter opened 9 years ago

benbalter commented 9 years ago

Per @konklone over in https://github.com/benbalter/2016-campaign-tech/pull/2#issuecomment-129070861, there are some changes in output between 1.x and 2.x.

I ran site inspector 1.0.2 against these 15 sites, and then compared their output against the 2.x branch (master).

Looking at:

There were two differences:

In both cases, the root domain redirects to www and only then to HTTPS. Per @konklone's comments describing the enforces_https? method, to count as enforcing HTTPS, the endpoint must "redirect immediately to HTTPS".

The 1.x branch reports both domains as enforcing HTTPS despite the intermediary redirect; the 2.x branch follows the behavior described in the comment and does not.
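To make the two readings concrete, here is a minimal sketch. Both helper names and the `chain` array (the ordered list of URLs visited, starting from the plain-HTTP root) are hypothetical and are not site-inspector's API; the "lenient" version is only my approximation of what 1.x appears to report, based on the results above.

```ruby
require "uri"

# Hypothetical helpers for illustration only, not site-inspector's API.
# `chain` is the ordered list of URLs visited, starting with the
# plain-HTTP root endpoint.

# Strict reading: the very first redirect must land on HTTPS.
def enforces_https_strict?(chain)
  first_hop = chain[1]
  !first_hop.nil? && URI(first_hop).scheme == "https"
end

# Lenient reading: intermediary hops are tolerated as long as there was
# at least one redirect and the chain ends on HTTPS.
def enforces_https_lenient?(chain)
  chain.length > 1 && URI(chain.last).scheme == "https"
end

# The pattern seen on both differing sites: root -> www -> https.
chain = [
  "http://example.com/",
  "http://www.example.com/",
  "https://www.example.com/"
]

puts enforces_https_strict?(chain)  # false
puts enforces_https_lenient?(chain) # true
```

Under the strict reading the intermediary hop to www over plain HTTP disqualifies the domain, which matches the quoted "redirect immediately to HTTPS" wording.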

@konklone which behavior is correct?

benbalter commented 8 years ago

Slightly stronger, yet still gentle bump, @konklone... Are you still using this project? Interested in upgrading to the latest version?

konklone commented 8 years ago

Hey @benbalter -- so yeah, we're still using this in pulse.cio.gov, but we were never quite able to carve out the time to upgrade from the 1.x branch to 3.x. There were more than just the differences noted above across the ~1,300 domains we track.

As I recall, one set of issues related to 3.x's removal of the code that retries upon TLS failure to determine why the TLS connection failed, since a chain problem is treated very differently from a mismatched hostname problem (the former is considered "supporting" HTTPS, the latter is considered "not supporting").
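Roughly, the distinction looks like the sketch below. This is not the 1.x code; `classify_tls` and `supports_https?` are hypothetical names, and the sketch assumes the caller has already tried a fully verified handshake and, on failure, retried with verification disabled to fetch the peer certificate. `OpenSSL::SSL.verify_certificate_identity` is the standard library's hostname check. Note that `:bad_chain` here lumps together every failure where the name matches (including, say, an expired certificate), which is a simplification.

```ruby
require "openssl"

# Hypothetical classifier for illustration, not code from site-inspector.
# verified: whether a fully verified handshake succeeded
# cert:     peer certificate from a retry with verification disabled
#           (nil if no TLS connection could be made at all)
def classify_tls(verified:, cert:, hostname:)
  return :valid if verified
  return :no_tls if cert.nil?

  if OpenSSL::SSL.verify_certificate_identity(cert, hostname)
    # The name matches, so the failure lies elsewhere (assumed: the chain).
    :bad_chain
  else
    :hostname_mismatch
  end
end

# A chain problem still counts as "supporting" HTTPS; a hostname
# mismatch does not.
def supports_https?(classification)
  %i[valid bad_chain].include?(classification)
end

# No certificate at all means no HTTPS support.
puts supports_https?(classify_tls(verified: false, cert: nil, hostname: "example.com")) # false
```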

I'm still up for upgrading, and there have been some bugs in the 1.0.2 behavior we've since caught due to user (mostly in-government employee) reports that I'd like to fix at the same time. It's just a big lift to re-open the box and start testing behavior across 1,300 domains, so it's been easy to keep the status quo.

These are some of the bugs I've found that I believe are in 1.0.2's behavior, that I've wanted to triage:

Would you be up for some joint hacking time sometime? Perhaps where I'm actively working on it but you're there to bounce questions off of, or to give advice on how you think it's best to integrate changes? It would help to focus me, as well as reassure me that it won't be as huge of a time commitment as the 1.0.2 branch turned out to be.

benbalter commented 8 years ago

"There were more than just the differences noted above across the ~1,300 domains we track."

Rather than making contribution an all-or-nothing proposition, where all 1,300 domains must be perfect first (which I can sympathize with being daunting and worth putting off until you have a giant dedicated chunk of time), are there smaller bites we can take to get some momentum going? For example, are you able to describe the changes in behavior (if not the specific sites)? Are there unexpected behaviors beyond SSL that you could submit a targeted pull request for?
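As one possible way to carve out those smaller bites, the sketch below diffs two sets of per-domain results and groups the differing domains by which check changed, so each group could become its own issue or pull request. The data shape (`{ "domain" => { "check" => value } }`), the check names, and both method names are assumptions made for illustration; neither version of site-inspector necessarily emits output in this form.

```ruby
# Hypothetical helpers for illustration. Each results hash is assumed to
# look like { "domain" => { "check_name" => value } }.

# Returns only the domains (present in both runs) whose checks differ,
# with the old and new value of each differing check.
def diff_results(old_results, new_results)
  diffs = {}
  (old_results.keys & new_results.keys).sort.each do |domain|
    old_checks = old_results[domain]
    new_checks = new_results[domain]
    changed = (old_checks.keys | new_checks.keys).each_with_object({}) do |check, acc|
      next if old_checks[check] == new_checks[check]
      acc[check] = { old: old_checks[check], new: new_checks[check] }
    end
    diffs[domain] = changed unless changed.empty?
  end
  diffs
end

# Groups differing domains by the check that changed.
def group_by_check(diffs)
  diffs.each_with_object(Hash.new { |h, k| h[k] = [] }) do |(domain, changed), acc|
    changed.each_key { |check| acc[check] << domain }
  end
end

old_run = { "example.com" => { "enforce_https" => true,  "hsts" => false } }
new_run = { "example.com" => { "enforce_https" => false, "hsts" => false } }

p group_by_check(diff_results(old_run, new_run))
# prints {"enforce_https"=>["example.com"]}
```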

konklone commented 8 years ago

I'm sure I could make my requests more discrete -- the issue is that confidently itemizing all the changes is most of the time-suck, and the main project I'm using it in is high-profile enough (among its target audience) that its tolerance for regressions is low.

Clearly inaction is just going to cost more over time -- this is the definition of technical debt. =) I do believe that I'm going to tackle this in January or February of this year, and as I do, I'll do my best to break it into parallelizable chunks.