Open hilvmason opened 2 years ago
@rekmarks MetaMask currently uses https://github.com/MetaMask/bip39, a fork of https://github.com/bitcoinjs/bip39. I believe this fork was needed to support one of the security fixes to https://medium.com/metamask/security-notice-extension-disk-encryption-issue-d437d4250863
Should we all use that one?
cc @Gudahtt
2 options:
@scure/bip39 used in snaps is smaller, supports tree shaking, and works on byte arrays.
It has also been audited and is used in EF's repo ethereum-cryptography.
It looks like that library still accepts the mnemonic as a string, and it still returns the mnemonic as a string (e.g. here and here). Using this library would mean abandoning our efforts to keep the "mnemonic as a string" out of memory. For context, we're avoiding using the mnemonic as a string in memory as a way to prevent low-effort local attack vectors that involve dumping browser memory and searching for a mnemonic-like string.
low-effort local attack vectors that involve dumping browser memory
low-effort
dumping browser memory
Seems like a joke to me.
As a side note: making @scure/bip39
not use mnemonics takes 2 lines of code. But i'm sure you're very wrong here if you are doing this and I can describe why.
@paulmillr all of the concerns Mark are raising are due to actual vulnerabilities that were disclosed to us and patched. Mark is not attacking your work or the package, everything he's saying is purely out of concern for the safety of our users. The benefits of your package are clear, and I'm sure that we can make any modifications necessary. Let's keep a respectful tone here.
Okay, that makes sense. The reason I had this tone is that "dumping browser memory" meant extremely complex attacks which cannot be easily tested against, also garbage collector provides no guarantees as for when any secrets are removed from the memory — so protecting against something like this is premature optimization.
This one however is more like "browser session restore attacks" and seems pretty common. Sorry for that!
making @scure/bip39 not use mnemonics takes 2 lines of code
@paulmillr could you elaborate a bit what you mean by this?
I'm attempting to patch a fork of @scure/bip39
to meet our requirements that the mnemonic never be in memory as a plain string. Doesn't seem like we can use the Coder
interface for mnemonicToEntropy()
or entropyToMnemonic()
, since it returns the array of strings for encode()
and only expects an array of strings for decode()
...
@adonesky1 what is your flow? What happens when a user enters phrase into the input? In what format you store it?
So currently our flow is, from the user inputing the seedphrase:
Which stores it as a class variable as a Uint8Array.
Skipped a few steps in that flow, but should give you the idea. It is briefly in string form, until we encode it to an array of numbers and then ship it to the background, and store it as a Uint8Array.
Does this answer your question?
It should work like this:
// words is an array of words here
export function mnemonicToEntropy(mnemonic: string, wordlist: string[]): Uint8Array {
const { words } = normalize(mnemonic);
const entropy = getCoder(wordlist).decode(words);
assertEntropy(entropy);
return entropy;
}
// you can remove alphabet and expect array of numbers/indexes which represent wordlist[index]
function getCoder(wordlist: string[]) {
if (!Array.isArray(wordlist) || wordlist.length !== 2 ** 11 || typeof wordlist[0] !== 'string')
throw new Error('Worlist: expected array of 2048 strings');
wordlist.forEach((i) => {
if (typeof i !== 'string') throw new Error(`Wordlist: non-string element: ${i}`);
});
return baseUtils.chain(
baseUtils.checksum(1, calcChecksum),
baseUtils.radix2(11, true),
baseUtils.alphabet(wordlist)
);
}
// you can receive array of words here
function normalize(str: string) {
const norm = nfkd(str);
const words = norm.split(' ');
if (![12, 15, 18, 21, 24].includes(words.length)) throw new Error('Invalid mnemonic');
return { nfkd: norm, words };
}
@paulmillr I think we want to be able to pass the mnemonic to Coder.decode()
in mnemonicToEntropy
in buffer form too. And then keep the mnemonic in buffer form during the implementation of entropyToMnemonic
Everything is doable. Just requires more work. The code i've mentioned is the simplest one that can be done with just a few changes — where you will pass wordlist indexes, which is just 12/24 uint16 words. See:
const phrase = 'length divorce zoo anger act cave alter art device emerge caution vicious'
const indexes = phrase.split(' ').map(word => wordlistEn.indexOf(word));
// [ 1024, 512, 2047, 69, 19, 294, 59, 102, 485, 581, 293, 1948 ];
// Can be trivially converted to uint8array and back
const u8a = new Uint8Array(new Uint16Array(indexes).buffer);
// Uint8Array(24) [
// 0, 4, 0, 2, 255, 7, 69, 0,
// 19, 0, 38, 1, 59, 0, 102, 0,
// 229, 1, 69, 2, 37, 1, 156, 7
// ]
// Other stuff
// HEX for json
const hex = bytesToHex(u8a);
// '00040002ff074500130026013b006600e501450225019c07'
const base_64 = base64.encode(hex);
// 'AAQAAv8HRQATACYBOwBmAOUBRQIlAZwH'
// AES encryption, will add 28 bytes to the output.
const encrypted = await aes.encrypt(some_static_metamask_storage_key, u8a);
// Convert back
const recoveredIndexes = Array.from(new Uint16Array(u8a.buffer));
const recoveredPhrase = recoveredIndexes.map(i => wordlistEn[i]).join(' ');
It would take much less space to store it than the current approach. And it can also be easily changed into mnemonic phrase.
Mnemonics could be stored as json/integer array, a hex string, or base64 string. You can also encrypt it.
@paulmillr would you mind taking a quick look at this first-pass PR I've made to our fork of @scure/bip39
to allow us to pass Buffers and Uint8Arrays to the BIP39 functions we need. I've tested it and it seems to be working for our needs, but it would be awesome if you could gut check it for me.
cc @kumavis @rekmarks
Final pieces here:
snaps-monorepo
version and pull that into the extension which includes the commit made here
The Snaps team recommends using
@scure/bip39
, as that's what we use in@metamask/key-tree
.