ignis-sec / Pwdb-Public

A collection of all the data I could extract from 1 billion leaked credentials from the internet.
MIT License
3.03k stars 398 forks

The mystery list #8

Open oleyka opened 4 years ago

oleyka commented 4 years ago

The list is, indeed, mysterious. Interestingly, even though you had a huge dataset to start with, it is missing several passwords that match the pattern and appear in a ton of records in HIBP, which means the 763K password list is hardly exhaustive.

"tgPw53j3kG" shows up 4354 times in HIBP
"odz1w1rB9T" appears 3769 times
"ZZ8807zpl" appears 7508 times

Any chance you could match the passwords to the emails they were used with, to see if there's a pattern? E.g., in the case of the passwords above, the first one shows up primarily next to gmail.com addresses in my (very limited) dataset, whereas the other two belong to hotmail users with very similar usernames (but not always! there are exceptions, too). It suggests to me that these could be either mass account takeovers where the attackers would reset all passwords to a single password, or auto-generated email accounts used for botfarms.
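
For anyone wanting to reproduce HIBP counts like the ones quoted above, here is a minimal sketch. It assumes the public Pwned Passwords range endpoint (`https://api.pwnedpasswords.com/range/<first 5 hex chars of the SHA-1>`), which returns `SUFFIX:COUNT` lines; the function names and the User-Agent string are illustrative and not from this thread. Counts change as HIBP is updated, so today's results may differ from the numbers above.

```python
import hashlib
import urllib.request


def split_sha1(password: str) -> tuple[str, str]:
    """Return (5-char prefix, 35-char suffix) of the uppercase SHA-1 hex digest."""
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    return digest[:5], digest[5:]


def count_in_range(range_body: str, suffix: str) -> int:
    """Look up `suffix` in a range response made of 'SUFFIX:COUNT' lines."""
    for line in range_body.splitlines():
        candidate, _, count = line.strip().partition(":")
        if candidate.upper() == suffix:
            return int(count)
    return 0  # suffix absent: the password is not in the returned range


def hibp_count(password: str) -> int:
    """Query the range endpoint; only the 5-char hash prefix is sent."""
    prefix, suffix = split_sha1(password)
    request = urllib.request.Request(
        f"https://api.pwnedpasswords.com/range/{prefix}",
        headers={"User-Agent": "mystery-list-check"},  # illustrative value
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return count_in_range(response.read().decode("utf-8"), suffix)
```

Calling `hibp_count("tgPw53j3kG")` performs the network request; `split_sha1` and `count_in_range` can be exercised offline.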

oleyka commented 4 years ago

I looked at the top passwords in Troy Hunt's database, published by @roycewilliams here: https://gist.github.com/roycewilliams/eef06c1148707ce8c8a1dea85768b207 , and picked out the 20 most frequently occurring passwords there with similar characteristics.

7 are present in your mystery list, 13 aren't. Considering that they all appear on the list above such fashionable hits as "jellybean1" and "iloveyou11", I am even more convinced these are bulk password resets and/or botnet passwords.

ignis-sec commented 4 years ago

Hello!

"tgPw53j3kG" shows up 4354 times in HIBP
"odz1w1rB9T" appears 3769 times
"ZZ8807zpl" appears 7508 times

Those amounts are way, way higher than what I was anticipating :) I'm currently running a quick scan checking:

  • The emails, and which leaks the credentials containing those 3 passwords came from.

    • This will let us see if there is a recognizable pattern to the email addresses, which would imply a spam farm or similar.
    • We'll also see if they all came from a single dump, which could imply these were test accounts in someone's database.

I'll let you know about the results as soon as I can.

Cheers!

ignis-sec commented 4 years ago

So the first query is over. I'll add a small sample of it below, but I've redacted the email addresses, because they show no predictable pattern implying they were auto-generated spam and may contain personal information.

They are also from a wide variety of dumps, which tells us it's not a single database filled with those.

E.g., in the case of the passwords above the first one shows up primarily next to gmail.com addresses in my (very limited) dataset

I've found 2135 credentials in my dataset with the password tgPw53j3kG. 1895 of those were @gmail.com.

[image: sample of the query results, email addresses redacted]
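
To put those two numbers in proportion, here is a small sketch. The `domain_shares` helper is hypothetical (not part of Pwdb-Public) and shows one way such a per-domain breakdown could be computed from a list of addresses; the last lines only redo the arithmetic for the figures quoted above.

```python
from collections import Counter


def domain_shares(emails):
    """Return {domain: fraction of addresses} for an iterable of email strings."""
    domains = Counter(
        email.rsplit("@", 1)[1].lower() for email in emails if "@" in email
    )
    total = sum(domains.values())
    if total == 0:
        return {}
    return {domain: count / total for domain, count in domains.most_common()}


# Figures quoted above: 1895 of the 2135 matching credentials were @gmail.com.
gmail_share = 1895 / 2135
print(f"gmail.com share: {gmail_share:.1%}")  # gmail.com share: 88.8%
```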

ignis-sec commented 4 years ago

It looks like you are absolutely right!

These accounts DID have normal passwords at one point.

Here is a small sample of all the passwords used by accounts that at some point appeared in dumps with the password tgPw53j3kG:

[image: sample of passwords used by these accounts]

It looks like you are absolutely correct about the mass account takeovers. Someone claimed all these accounts and set a single password for all of them.

oleyka commented 4 years ago

My rough understanding is as follows: Multiple passwords for the same email could simply indicate that the collection contains records from different Internet services, not that the passwords were reset. People do not necessarily reuse their passwords everywhere. However, if all the other passwords of the corresponding email follow a different pattern, that is likely an indication that the password was not set by the original owner, but was a takeover. And that is what you saw in your analysis. Some of the occurrences of these 10-character passwords that I've seen were supposedly from a PayPal credential dump...
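
The heuristic described above can be sketched roughly as follows. Everything here is an assumption made for illustration: the thread never defines "the pattern", so the regular expression (9 to 10 alphanumeric characters mixing upper case, lower case and digits) is a stand-in, and `likely_takeovers` is a hypothetical helper, not the query that was actually run.

```python
import re
from collections import defaultdict

# Assumed stand-in for "the pattern"; the thread does not define it precisely.
MYSTERY_LIKE = re.compile(r"^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)[A-Za-z0-9]{9,10}$")


def likely_takeovers(records, shared_password):
    """Return emails seen with `shared_password` whose other passwords all look different."""
    by_email = defaultdict(set)
    for email, password in records:
        by_email[email.lower()].add(password)

    flagged = []
    for email, passwords in by_email.items():
        if shared_password not in passwords:
            continue
        others = passwords - {shared_password}
        # With no other password there is nothing to compare, so stay undecided.
        if others and not any(MYSTERY_LIKE.match(p) for p in others):
            flagged.append(email)
    return sorted(flagged)


# Invented records, for illustration only.
records = [
    ("alice@example.com", "tgPw53j3kG"),
    ("alice@example.com", "sunshine1"),
    ("bob@example.com", "tgPw53j3kG"),
    ("carol@example.com", "tgPw53j3kG"),
    ("carol@example.com", "Xk29dLm3Qp"),
    ("dave@example.com", "hunter2"),
]
print(likely_takeovers(records, "tgPw53j3kG"))  # ['alice@example.com']
```

A flagged address is only a hint: as noted above, differing passwords can also just mean the records come from different services.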

I am wondering if those 10-character passwords are set to match the corresponding botnet customers' credentials... ;) That would make sense, right? A customer purchases a set of N accounts, the bot farmer (not sure what the right term is, this is not quite my field) sets the customer's password on all of them, and then they can start abusing the accounts. Once the customer is done with their scam project, they forget about those accounts and eventually the credentials get leaked in yet another data dump.

oleyka commented 4 years ago

Haha, you know what I have also noticed? Sometimes the same gmail address is used to create multiple accounts on some services, sharing the same password. E.g. these would all be different accounts, linked to the same email:

my.email.address@gmail.com
myemailaddress@gmail.com
m.y.e.m.a.i.l.a.d.d.r.e.s.s@gmail.com

That would artificially increase the occurrence of a particular password in the various password dumps.

In the very small dataset that I could find, the password "3rJs1la7qE" (#256 on Troy Hunt's list) gave some eye-opening results. Check it out with yours!

ignis-sec commented 4 years ago

Haha, you know what I have also noticed? Sometimes the same gmail address is used to create multiple accounts on some services, sharing the same password. E.g. these would all be different accounts, linked to the same email:

my.email.address@gmail.com
myemailaddress@gmail.com
m.y.e.m.a.i.l.a.d.d.r.e.s.s@gmail.com

Yup, I've noticed that, and earlier I filtered out email addresses that use multiple dots to create multiple accounts. Relevant tweet: https://twitter.com/ahakcil/status/1277170571944628225
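
A minimal sketch of that kind of filtering, under the assumption that it amounts to collapsing Gmail dot variants (Gmail ignores dots in the local part) and `+tag` suffixes to one canonical address. The function names are hypothetical, and this is not necessarily how the Pwdb-Public dataset was actually cleaned.

```python
def canonical_gmail(address: str) -> str:
    """Collapse Gmail dot and +tag variants; leave other domains untouched."""
    cleaned = address.strip().lower()
    local, sep, domain = cleaned.rpartition("@")
    if not sep:
        return cleaned  # not an email address, return as-is
    if domain in ("gmail.com", "googlemail.com"):
        local = local.split("+", 1)[0].replace(".", "")
        domain = "gmail.com"
    return f"{local}@{domain}"


def dedupe(addresses):
    """Keep the first address seen for each canonical form."""
    seen = {}
    for address in addresses:
        seen.setdefault(canonical_gmail(address), address)
    return list(seen.values())


variants = [
    "my.email.address@gmail.com",
    "myemailaddress@gmail.com",
    "m.y.e.m.a.i.l.a.d.d.r.e.s.s@gmail.com",
]
print(dedupe(variants))  # ['my.email.address@gmail.com']
```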

I've checked out 3rJs1la7qE, and my dataset has no examples like m.y.e.m.a.i.l.a.d.d.r.e.s.s@gmail.com, simply because those got cleaned up earlier. However, the accounts using this password seem to be auto-generated spam accounts without an actual owner.

I've checked a few of them and these email addresses are not registered.

Some examples using 3rJs1la7qE:

bgbjktk@gmail.com
bgbtekq@gmail.com
bgbtzyc@gmail.com
bgbxefo@gmail.com
bgbxzgu@gmail.com
bgcbfxj@gmail.com
bgcbfxj@gmail.com
bgcdxyf@gmail.com
bgcdxyf@gmail.com
bgcfyxz@gmail.com
bgcqjzi@gmail.com
bgdiqcw@interia.pl
bgdiqcw@yahoo.com
bgdtxdr@gmail.com
bgdtxdr@gmail.com
bgeattr@gmail.com
bgeattr@gmail.com
bgedgkc@gmail.com
bgedgkc@gmail.com
bgedgkc@gmail.com
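
As a rough illustration of why a sample like this looks machine-generated, the sketch below summarizes the 20 lines above. The `summarize` helper and the "seven lowercase letters" shape test are assumptions made here for illustration; a uniform shape is a hint of auto-generation, not proof.

```python
import re
from collections import Counter

SAMPLE = """
bgbjktk@gmail.com bgbtekq@gmail.com bgbtzyc@gmail.com bgbxefo@gmail.com
bgbxzgu@gmail.com bgcbfxj@gmail.com bgcbfxj@gmail.com bgcdxyf@gmail.com
bgcdxyf@gmail.com bgcfyxz@gmail.com bgcqjzi@gmail.com bgdiqcw@interia.pl
bgdiqcw@yahoo.com bgdtxdr@gmail.com bgdtxdr@gmail.com bgeattr@gmail.com
bgeattr@gmail.com bgedgkc@gmail.com bgedgkc@gmail.com bgedgkc@gmail.com
""".split()

# Assumed shape test: local part is exactly seven lowercase letters.
LOCAL_SHAPE = re.compile(r"[a-z]{7}")


def summarize(addresses):
    """Count duplicates and check whether every local part has the same shape."""
    counts = Counter(addresses)
    local_parts = {address.split("@", 1)[0] for address in counts}
    return {
        "records": len(addresses),
        "distinct_addresses": len(counts),
        "repeated_addresses": sum(1 for n in counts.values() if n > 1),
        "distinct_local_parts": len(local_parts),
        "uniform_shape": all(LOCAL_SHAPE.fullmatch(p) for p in local_parts),
    }


print(summarize(SAMPLE))
```

For the sample above this reports 20 records, 14 distinct addresses (5 of them repeated) and 13 distinct local parts, all of them seven lowercase letters.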

JulianVolodia commented 4 years ago

Haha, you know what I have also noticed? Sometimes the same gmail address is used to create multiple accounts on some services, sharing the same password. E.g. these would all be different accounts, linked to the same email: my.email.address@gmail.com myemailaddress@gmail.com m.y.e.m.a.i.l.a.d.d.r.e.s.s@gmail.com

Yup, i've noticed that and filtered out email addresses using multiple dots to create multiple accounts earlier.

Yes. It's done because many ppl used to make phishing addresses in Gmail afaik. Moreover, removing those dots is not RFC-compliant. So why add them? The feature is there to make it easier to trace how you filter your mailboxes and to see where your creds leaked from when they show up in db drops... but many services strip the comment part from addresses, so myaddr+newsletter@ would nearly always be removed. Actually, I haven't seen any spamming bots using dots to spam Gmail addresses, tbh.

Shoutout: Nice set, thanks! :)))