Closed roycewilliams closed 2 years ago
Hmm. Not a shabby idea. It does come up short of 20K, though not by much:
> let wordList = []
> wordList = wordList.concat(alternatePgp)
> wordList = wordList.concat(alternatePokerware)
> wordList = wordList.concat(alternateWordle)
> wordList = wordList.concat(bitcoinEN)
> wordList = wordList.concat(dicewareNLP[0])
> wordList = wordList.concat(dicewareNLP[1])
> wordList = wordList.concat(effDistant)
> wordList = wordList.concat(effLong)
> wordList = wordList.concat(effShort)
> wordList = uniquesOnly(wordList)
> console.log(wordList.length)
18174
> const entropy = getEntropy()
> console.log(entropy)
70
> const len = Math.ceil(entropy / Math.log2(wordList.length))
> console.log(len)
5
> for (let i = 0; i < 10; i++) {
console.log(generatePass(len, wordList, true, false))
}
'serrated abuses guys fancy trousers'
'upload shuts quick verbose unlined'
'unknotted manna roost blemishes twisting'
'choice alack consent deafening uproar'
'escapade john owed limeade blur'
'trifles cave stride halos fez'
'shaped handheld salary curds unjam'
'irritabily gained chinese narrative caves'
'sables depravity encroach archer equals'
'python baler microwave chirps credits'
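For what it's worth, the length calculation in the session above can be pulled out into a small standalone sketch to see how much the list size actually matters. The helper names `wordsNeeded` and `passphraseBits` are made up for illustration and are not part of the project; the sketch assumes every word is drawn uniformly at random from the list.

```javascript
// Hypothetical helper (not from the project): number of words needed to
// reach a target entropy in bits, assuming uniform random selection
// from a list of `listSize` unique words.
function wordsNeeded(targetBits, listSize) {
  return Math.ceil(targetBits / Math.log2(listSize))
}

// Hypothetical helper: entropy in bits delivered by `count` words.
function passphraseBits(count, listSize) {
  return count * Math.log2(listSize)
}

console.log(wordsNeeded(70, 18174)) // 5
console.log(wordsNeeded(70, 20000)) // 5
console.log(passphraseBits(5, 18174) >= 70) // true
```

Under these assumptions, 18,174 words and 20,000 words both land on five words for a 70-bit target. Dropping to four words would take a list of roughly 2^17.5 (about 185,000) entries, so falling a little short of 20K does not change the passphrase length here.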
It would be useful to have an English list that balances these three goals:
I'm not sure of the best way to accomplish this. I'm envisioning an Alternate list that isn't "English (all)", but is instead a subset of the wordlists that make up that list. Perhaps called "English (common)" or "English (broad)" or "English (most)" or something like that. Specialized lists (like Star Trek, Simpsons, etc.) would be excluded.
Initial source lists might be:
... but with all capitalized words lower-cased (which would eliminate having to remember that words 2 and 5 are capitalized, etc., but that's up for discussion) and then deduplicated. My hope is that this will produce a list in the 20K range or higher.
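As a rough sketch of the lower-case-then-deduplicate step, something like the following might work. `mergeLowercased` is a hypothetical name rather than an existing function in the project, and the two input arrays are toy data, not the real wordlists.

```javascript
// Hypothetical helper (not from the project): merge several wordlists,
// lower-case every word, and drop duplicates. A Set preserves insertion
// order, so the first occurrence of each word decides its position.
function mergeLowercased(...lists) {
  const seen = new Set()
  for (const list of lists) {
    for (const word of list) {
      seen.add(word.toLowerCase())
    }
  }
  return [...seen]
}

// Toy data for illustration only.
const listA = ['Apple', 'banana', 'Cherry']
const listB = ['apple', 'cherry', 'date']

console.log(mergeLowercased(listA, listB))
// [ 'apple', 'banana', 'cherry', 'date' ]
```

Lower-casing before deduplicating matters: words like 'Apple' and 'apple' collapse into one entry, so the merged list can end up smaller than a case-sensitive merge would suggest.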
Any future general (not topic-specific) English wordlists could also be added.