mozilla / blurts-addon

Mozilla Public License 2.0

Expected behavior w/ sites that have multiple breaches #20

Open pdehaan opened 6 years ago

pdehaan commented 6 years ago

STR:

Search https://haveibeenpwned.com/PwnedWebsites for "Bell (", and you should get 2 results:

Bell (2014 breach)

> In February 2014, Bell Canada suffered a data breach via the hacker collective known as NullCrew. The breach included data from multiple locations within Bell and exposed email addresses, usernames, user preferences and a number of unencrypted passwords and credit card data from 40,000 records containing just over 20,000 unique email addresses and usernames.
>
> **Breach date:** 1 February 2014
> **Date added to HIBP:** 1 February 2014
> **Compromised accounts:** 20,902
> **Compromised data:** Credit cards, Genders, Passwords, Usernames

Bell (2017 breach)

> In May 2017, the Bell telecommunications company in Canada suffered a data breach resulting in the exposure of millions of customer records. The data was consequently leaked online with a message from the attacker stating that they were "releasing a significant portion of Bell.ca's data due to the fact that they have failed to cooperate with us" and included a threat to leak more. The impacted data included over 2 million unique email addresses and 153k survey results dating back to 2011 and 2012. There were also 162 Bell employee records with more comprehensive personal data including names, phone numbers and plain text "passcodes". Bell suffered another breach in 2014 which exposed 40k records.
>
> **Breach date:** 15 May 2017
> **Date added to HIBP:** 16 May 2017
> **Compromised accounts:** 2,231,256
> **Compromised data:** Email addresses, Geographic locations, IP addresses, Job titles, Names, Passwords, Phone numbers, Spoken languages, Survey results, Usernames

Both seem to be for https://www.bell.ca/

Navigate to https://www.bell.ca/ and you get notified of the most recent breach (but not the earlier breach). Not sure if we need to build in some next/previous style navigation for sites w/ multiple breaches, or if that's just too awkward and confusing. Although the first breach only had 21k compromised accounts (versus the 2.2m compromised accounts in the 2017 breach), the first breach did include credit card numbers, so that may be valuable information to share with users. I only did a quick scan of breached domains, and I think that Bell.ca is the only one w/ multiple breaches.

pdehaan commented 6 years ago

Actually, I think I was wrong... it looks like http://www.r2games.com has had multiple (2) breaches as well. Which is slightly interesting, because it isn't showing me the most recent breach. So maybe we're just returning the first result from the breaches.json or something and these aren't sorted.
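
If the lookup really does return whichever entry comes first, one possible fix is to pick the entry with the latest breach date explicitly. The following is only a rough sketch: `mostRecentBreach` and the `sample` array are hypothetical, and it assumes each entry carries a `Domain` and a `BreachDate` in `YYYY-MM-DD` form, as the HIBP data does.

```js
// Hypothetical helper, not taken from the add-on's source.
// Assumes `BreachDate` is a "YYYY-MM-DD" string, which sorts correctly as plain text.
function mostRecentBreach(breaches, domain) {
  return breaches
    .filter(breach => breach.Domain === domain)
    // Newest first; filter() already returned a copy, so the input is not mutated.
    .sort((a, b) => b.BreachDate.localeCompare(a.BreachDate))[0];
}

// Illustrative subset of the breach data, deliberately out of date order.
const sample = [
  { Name: "R2-2017", Domain: "r2games.com", BreachDate: "2017-01-01" },
  { Name: "R2Games", Domain: "r2games.com", BreachDate: "2015-11-01" },
  { Name: "Bell", Domain: "bell.ca", BreachDate: "2014-02-01" },
  { Name: "Bell2017", Domain: "bell.ca", BreachDate: "2017-05-15" },
];

console.log(mostRecentBreach(sample, "bell.ca").Name);     // Bell2017
console.log(mostRecentBreach(sample, "r2games.com").Name); // R2-2017
```

A domain with no entries yields `undefined`, so a caller would need to handle that case.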

R2 (2017 forum breach)

In early 2017, the forum for the gaming website R2 Games was hacked. R2 had previously appeared on HIBP in 2015 after a prior incident. This one exposed over 1 million unique user accounts and corresponding MD5 password hashes with no salt.

Breach date: 1 January 2017
Date added to HIBP: 25 April 2017
Compromised accounts: 1,023,466
Compromised data: Email addresses, Passwords, Usernames, Website activity


R2Games

In late 2015, the gaming website R2Games was hacked and more than 2.1M personal records disclosed. The vBulletin forum included IP addresses and passwords stored as salted hashes using a weak implementation enabling many to be rapidly cracked. A further 11M accounts were added to "Have I been pwned" in March 2016 and another 9M in July 2016 bringing the total to over 22M.

Breach date: 1 November 2015
Date added to HIBP: 9 February 2016
Compromised accounts: 22,281,337
Compromised data: Email addresses, IP addresses, Passwords, Usernames

pdehaan commented 6 years ago

OK, I wrote a for-reals parser which checks for domains w/ more than one breach:

```sh
$ node dupes

"" has 9 breaches.
bell.ca has 2 breaches.
forum.btcsec.com has 2 breaches.
r2games.com has 2 breaches.
data4marketers.com has 2 breaches.

$ cat dupes.js
```

```js
const breaches = require("./breaches.json");

// Maps each breach domain to the number of breaches recorded for it.
const breachMap = new Map();

breaches.forEach(breach => {
  if (!breachMap.has(breach.Domain)) {
    breachMap.set(breach.Domain, 1);
  } else {
    const domainCount = breachMap.get(breach.Domain);
    breachMap.set(breach.Domain, domainCount + 1);
  }
});

// Report only the domains that appear more than once.
[...breachMap].filter(([domain, count]) => {
  return count > 1;
}).forEach(([domain, count]) => {
  console.log(`${domain} has ${count} breaches.`);
});
```

NOTE: Empty domain breaches filed as #50; "A few empty domains in breaches.json".

nhnt11 commented 6 years ago

UX suggests: When a domain has multiple associated breaches,

  1. Use the Name of the first breach chronologically.
  2. Get everything else from the last breach chronologically.
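
The rule above could be sketched roughly as follows. This is not the add-on's actual code: `doorhangerBreach` and the `bell` sample are hypothetical, and the sketch assumes the HIBP field names (`Name`, `Title`, `BreachDate`, `PwnCount`) with `BreachDate` in `YYYY-MM-DD` form.

```js
// Hypothetical sketch: combine the breaches for one domain into a single record.
function doorhangerBreach(domainBreaches) {
  if (domainBreaches.length === 0) {
    return undefined;
  }
  // Oldest first; "YYYY-MM-DD" strings sort correctly as plain text.
  const byDate = [...domainBreaches].sort((a, b) => a.BreachDate.localeCompare(b.BreachDate));
  const first = byDate[0];
  const last = byDate[byDate.length - 1];
  // Everything comes from the latest breach except Name, which comes from the earliest.
  return { ...last, Name: first.Name };
}

const bell = [
  { Name: "Bell2017", Title: "Bell (2017 breach)", BreachDate: "2017-05-15", PwnCount: 2231256 },
  { Name: "Bell", Title: "Bell (2014 breach)", BreachDate: "2014-02-01", PwnCount: 20902 },
];

console.log(doorhangerBreach(bell));
// { Name: 'Bell', Title: 'Bell (2017 breach)', BreachDate: '2017-05-15', PwnCount: 2231256 }
```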

pdehaan commented 6 years ago

> 1. Use the Name of the first breach chronologically.
> 2. Get everything else from the last breach chronologically.

This feels weird to me, but I've been struggling to find a solution that works (apart from adding paging for the ~3 sites with 2+ breaches). We can't reliably sort by severity. Sorting by number of Pwned accounts seems a bit arbitrary... My gut almost just says keep it simple and always show the most recent breach's details (including Name).

Here's the current multi-breach status using the latest breach data from HIBP:

  1. bell.ca

      Bell (2014 breach) (Bell)
      Breach domain: bell.ca
      Breach date: 2014-02-01
      Added date: 2014-02-01T23:57:10Z
      Pwn count: 20902
      Data Classes: Credit cards, Genders, Passwords, Usernames
      Description: In February 2014, <a href="http://news.softpedia.com/news/Hackers-Claim-to-Have-Breached-Bell-Canada-s-Systems-422952.shtml?utm_medium=twitter&utm_source=FredToadster" target="_blank" rel="noopener">Bell Canada suffered a data breach via the hacker collective known as NullCrew</a>. The breach included data from multiple locations within Bell and exposed email addresses, usernames, user preferences and a number of unencrypted passwords and credit card data from 40,000 records containing just over 20,000 unique email addresses and usernames.
    
      Bell (2017 breach) (Bell2017)
      Breach domain: bell.ca
      Breach date: 2017-05-15
      Added date: 2017-05-16T01:49:31Z
      Pwn count: 2231256
      Data Classes: Email addresses, Geographic locations, IP addresses, Job titles, Names, Passwords, Phone numbers, Spoken languages, Survey results, Usernames
      Description: In May 2017, <a href="http://www.cbc.ca/beta/news/technology/bell-data-breach-customer-names-phone-numbers-emails-leak-1.4116608" target="_blank" rel="noopener">the Bell telecommunications company in Canada suffered a data breach</a> resulting in the exposure of millions of customer records. The data was consequently leaked online with a message from the attacker stating that they were &quot;releasing a significant portion of Bell.ca's data due to the fact that they have failed to cooperate with us&quot; and included a threat to leak more. The impacted data included over 2 million unique email addresses and 153k survey results dating back to 2011 and 2012. There were also 162 Bell employee records with more comprehensive personal data including names, phone numbers and plain text &quot;passcodes&quot;. Bell suffered another breach in 2014 which exposed 40k records.

  2. forum.btcsec.com

      Bitcoin Security Forum Gmail Dump (BTSec)
      Breach domain: forum.btcsec.com
      Breach date: 2014-01-09
      Added date: 2014-09-10T20:30:11Z
      Pwn count: 4789599
      Data Classes: Email addresses, Passwords
      Description: In September 2014, a large dump of nearly 5M usernames and passwords was <a href="https://forum.btcsec.com/index.php?/topic/9426-gmail-meniai-parol/" target="_blank" rel="noopener">posted to a Russian Bitcoin forum</a>. Whilst commonly reported as 5M &quot;Gmail passwords&quot;, the dump also contained 123k yandex.ru addresses. Whilst the origin of the breach remains unclear, the breached credentials were <a href="http://web.archive.org/web/20140910190920/http://www.reddit.com/r/netsec/comments/2fz13q/5_millions_of_gmail_passwords_leaked_rus_most/" target="_blank" rel="noopener">confirmed by multiple source as correct</a>, albeit a number of years old.
    
      Yandex Dump (Yandex)
      Breach domain: forum.btcsec.com
      Breach date: 2014-09-07
      Added date: 2014-09-12T04:50:32Z
      Pwn count: 1186564
      Data Classes: Email addresses, Passwords
      Description: In September 2014, <a href="http://habrahabr.ru/post/235949/" target="_blank" rel="noopener">news broke of a massive leak of accounts from Yandex</a>, the Russian search engine giants who also provides email services. The purported million &quot;breached&quot; accounts were disclosed at the same time as nearly 5M mail.ru accounts with <a href="http://globalvoicesonline.org/2014/09/10/russia-email-yandex-mailru-passwords-hacking/" target="_blank" rel="noopener">both companies claiming the credentials were acquired via phishing scams</a> rather than being obtained as a result of direct attacks against their services.

  3. r2games.com

      R2 (2017 forum breach) (R2-2017)
      Breach domain: r2games.com
      Breach date: 2017-01-01
      Added date: 2017-04-25T11:04:29Z
      Pwn count: 1023466
      Data Classes: Email addresses, Passwords, Usernames, Website activity
      Description: In early 2017, the forum for the gaming website <a href="http://www.csoonline.com/article/3192246/security/r2games-compromised-again-over-one-million-accounts-exposed.html" target="_blank" rel="noopener">R2 Games was hacked</a>. R2 had previously appeared on HIBP in 2015 after a prior incident. This one exposed over 1 million unique user accounts and corresponding MD5 password hashes with no salt.
    
      R2Games (R2Games)
      Breach domain: r2games.com
      Breach date: 2015-11-01
      Added date: 2016-02-09T12:20:35Z
      Pwn count: 22281337
      Data Classes: Email addresses, IP addresses, Passwords, Usernames
      Description: In late 2015, the gaming website <a href="https://www.r2games.com" target="_blank" rel="noopener">R2Games</a> was hacked and more than 2.1M personal records disclosed. The vBulletin forum included IP addresses and passwords stored as salted hashes using a weak implementation enabling many to be rapidly cracked. A further 11M accounts were added to "Have I been pwned" in March 2016 and another 9M in July 2016 bringing the total to over 22M.

The problem is that, in each case, the Title of the oldest breach will be very confusing if we pair it with the Description of the newest breach:

> Bell (2014 breach)
>
> In May 2017, the Bell telecommunications company in Canada suffered a data breach resulting in the exposure of millions of customer records. The data was consequently leaked online with a message from the attacker stating that they were "releasing a significant portion of Bell.ca's data due to the fact that they have failed to cooperate with us" and included a threat to leak more. The impacted data included over 2 million unique email addresses and 153k survey results dating back to 2011 and 2012. There were also 162 Bell employee records with more comprehensive personal data including names, phone numbers and plain text "passcodes". Bell suffered another breach in 2014 which exposed 40k records.

And this could get weird if a user clicks through to the monitor.firefox.com site and then sees a different title or different details, depending on whether we redirect them to the oldest or the newest breach.

multibreach.js

```js
const breaches = require("./breaches.json");

// Group the filtered breaches by domain.
const breachMap = monitorBreaches(breaches).reduce((map, breach) => {
  const arr = map.get(breach.Domain) || [];
  arr.push(breach);
  map.set(breach.Domain, arr);
  return map;
}, new Map());

// Keep only non-empty domains with more than one breach.
const multiBreaches = [...breachMap].filter(([domain, breachArr]) => domain && breachArr.length > 1);

for (const [domain, aBreaches] of multiBreaches) {
  console.log(domain);
  // BreachDate is a "YYYY-MM-DD" string; parse it so the subtraction
  // yields a number and the breaches sort newest first.
  aBreaches.sort((breachA, breachB) => {
    return new Date(breachB.BreachDate) - new Date(breachA.BreachDate);
  }).forEach(breach => {
    console.log(`
      ${breach.Title} (${breach.Name})
      Breach domain: ${breach.Domain}
      Breach date: ${breach.BreachDate}
      Added date: ${breach.AddedDate}
      Pwn count: ${breach.PwnCount}
      Data Classes: ${breach.DataClasses.join(", ")}
      Description: ${breach.Description}
    `);
  });
}

// Keep only verified breaches that are not retired, sensitive, or spam lists.
function monitorBreaches(breaches) {
  return breaches.filter(breach => breach.IsVerified && !breach.IsRetired && !breach.IsSensitive && !breach.IsSpamList);
}
```

nhnt11 commented 6 years ago

@pdehaan This is only for the doorhanger. We're not showing the breach description in the doorhanger anymore - only the Name (not the Title). The point of using the Name of the first breach chronologically was (for example) to make sure we use "Bell" vs "Bell2017". What do you think?

nhnt11 commented 6 years ago

Oops, didn't mean to close this.

pdehaan commented 6 years ago

Oh, interesting, OK... We aren't publicly displaying the breach.Name anywhere on the blurts server, only the breach.Title (the Name is only used for determining the logo image, or in the /?breach={Name} slug).

https://fx-breach-alerts.herokuapp.com/?breach=Bell
https://fx-breach-alerts.herokuapp.com/?breach=Bell2017

So basically we're arguing if you want to display:

| .Name | .Title |
| --- | --- |
| "Bell" or "Bell2017" | "Bell (2014 breach)" or "Bell (2017 breach)" |
| "BTSec" or "Yandex" | "Bitcoin Security Forum Gmail Dump" or "Yandex Dump" |
| "R2Games" or "R2-2017" | "R2Games" or "R2 (2017 forum breach)" |

Yeah, sure. Personally I think the .Title is more user friendly and consistent w/ blurts-server, but this is currently a pretty rare problem to have multiple breaches. Probably not worth the effort/overhead of adding special "This site has had multiple breaches. The most recent breach was %Title%." UI.

If you want to display the .Name, I'd probably vote to use the original breach. If you want to display the .Title, I'd argue that displaying the most recent breach is more relevant.

nhnt11 commented 6 years ago

@pdehaan We need to use .Name over .Title in the doorhanger because of the string into which it gets formatted: "Xyz accounts from FooBar were compromised..."

It doesn't make sense to replace FooBar with "Yandex Dump" for example.
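
To make the difference concrete, here is a rough sketch of that substitution. `doorhangerText` is a hypothetical helper, the template is only the fragment quoted above (the real string is longer and presumably localized), and it assumes "Xyz" stands for the account count.

```js
// Hypothetical helper: fills the quoted string fragment with a count and a site label.
function doorhangerText(pwnCount, siteLabel) {
  // Assumption: the count is shown with thousands separators.
  const count = pwnCount.toLocaleString("en-US");
  return `${count} accounts from ${siteLabel} were compromised...`;
}

// Using .Name reads like a site name:
console.log(doorhangerText(1186564, "Yandex"));
// 1,186,564 accounts from Yandex were compromised...

// Using .Title reads awkwardly in the same slot:
console.log(doorhangerText(1186564, "Yandex Dump"));
// 1,186,564 accounts from Yandex Dump were compromised...
```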

nhnt11 commented 6 years ago

@pdehaan++ Thanks for articulating all of this though, super useful for posterity.