berzerk0 / Probable-Wordlists

Version 2 is live! Wordlists sorted by probability originally created for password generation and testing - make sure your passwords aren't popular!
Creative Commons Attribution Share Alike 4.0 International
8.71k stars 1.61k forks

Are passwords for the same mail address deduplicated? #49

Closed: domenukk closed this issue 5 years ago

domenukk commented 5 years ago

Looking through recent leaks, I found mail:password combos that appear particularly often within the same leak. That repetition, however, does not mean the password is more commonly used. Each combo should still count as a single occurrence. Is this taken into account?

berzerk0 commented 5 years ago

Yes.

In revisions 1 and 2, a password's popularity is determined based on the number of source files it appeared in, not how many times it appeared in each file. Each source file was stripped of all non-password information (as much as possible), and then had duplicates removed.
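
As a rough illustration of that counting scheme (a minimal sketch, not the project's actual tooling), the Python below strips each line down to its password, deduplicates within each source file, and then ranks passwords by how many files they appear in. The function names `extract_password` and `rank_by_file_count` are hypothetical, and so is the `mail:password` line format, which is assumed from the question above.

```python
from collections import Counter


def extract_password(line: str) -> str:
    """Return only the password part of a line.

    Assumption: a line is either a bare password or 'mail:password'.
    A prefix is treated as a mail address only if it contains '@', so a
    bare password such as 'p@ss:word' would be misparsed by this sketch.
    """
    line = line.rstrip("\r\n")
    head, sep, tail = line.partition(":")
    if sep and "@" in head:
        return tail
    return line


def rank_by_file_count(sources):
    """Rank passwords by the number of source files they appear in.

    `sources` is an iterable of files, each given as an iterable of lines.
    Repeats inside a single file are collapsed before counting, so a combo
    that is duplicated within one leak still contributes only 1.
    """
    counts = Counter()
    for lines in sources:
        unique = {extract_password(line) for line in lines}
        unique.discard("")  # ignore empty lines
        counts.update(unique)  # each file adds at most 1 per password
    # Most files first; ties broken alphabetically for a stable order.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


if __name__ == "__main__":
    leak_a = ["a@x.com:hunter2", "a@x.com:hunter2", "b@y.com:123456"]
    leak_b = ["123456", "letmein"]
    for password, n_files in rank_by_file_count([leak_a, leak_b]):
        print(n_files, password)
```

In this toy input, `hunter2` is repeated inside `leak_a` but is still counted once, while `123456` ranks first because it appears in both files.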