maciejhirsz / tiny-bip39

A Rust implementation of BIP-0039
Apache License 2.0
64 stars 37 forks source link

Rust crate gives different mnemonic derivation to the offical JS bip39 package #26

Closed CodeForcer closed 3 years ago

CodeForcer commented 3 years ago

Looking into the code the difference seems to be that the JS bip39 package uses:

const mnemonicBuffer = Buffer.from(normalize(mnemonic), 'utf8');

Whereas the Rust package uses the JS packages's conversionToEntropy logic (in the Rust package this is phrase_to_entropy) before proceeding with the derivation:

let mut bits = BitWriter::with_capacity(264);
for word in phrase.split(" ") {
    bits.push(wordmap.get_bits(&word)?);
}
let mtype = MnemonicType::for_word_count(bits.len() / 11)?;
let mut entropy = bits.into_bytes();
let entropy_bytes = mtype.entropy_bits() / 8;
entropy.truncate(entropy_bytes);

The results of both operations with the same phrase is different.

There can only be one correct implementation of bip39, so which package is incorrect?

maciejhirsz commented 3 years ago

Can you link to the JS implementation you are using?

~Also, the code you pasted for Rust is not matching the source code for tiny-bip39.~ NVM, I was looking at wrong method.

CodeForcer commented 3 years ago

The relevant code in the JS package is here: https://github.com/bitcoinjs/bip39/blob/master/src/index.js#L52

And the logic I was examining in this package was from the function called by the function you just linked: https://github.com/maciejhirsz/tiny-bip39/blob/6abc65afeb148671b0e1781b78564b2e16c958dd/src/mnemonic.rs#L187

The phrase to entropy logic is in the JS package as mnemonicToEntropy, but it's not used when creating a seed from mnemonic, unlike Rust which does use it.

maciejhirsz commented 3 years ago

So, both tiny-bip39 and the bip39 npm module use the same test vectors. There is no normalization in phrase_to_entropy since that's handled in from_phrase. How are you using them that gives you different results?

maciejhirsz commented 3 years ago

The phrase to entropy logic is in the JS package as mnemonicToEntropy, but it's not used when creating a seed from mnemonic, unlike Rust which does use it.

It doesn't actually.

CodeForcer commented 3 years ago

I was trying to investigate why the mnemonic derivations into keys using Polkadot give different keys to the derivations using the JS bip39 package. Digging into the code I first noticed the Polkadot JS packages wrap into a Rust library which calls: https://github.com/polkadot-js/wasm/blob/master/packages/wasm-crypto/src/rs/bip39.rs#L40:

#[wasm_bindgen]
pub fn ext_bip39_to_mini_secret(phrase: &str, password: &str) -> Vec<u8> {
    let salt = format!("mnemonic{}", password);
    let mnemonic = Mnemonic::from_phrase(phrase, Language::English).unwrap();
    let mut result = [0u8; 64];

    pbkdf2::<Hmac<Sha512>>(mnemonic.entropy(), salt.as_bytes(), 2048, &mut result);

    result[..32].to_vec()
}

I then examined the JS package which performs the same logic (turn mnemonic into bytes --> create salt with "mnemonic"+password, then run the hashing algorithms) but gives a different result:

function mnemonicToSeedSync(mnemonic, password) {
    const mnemonicBuffer = Buffer.from(normalize(mnemonic), 'utf8');
    const saltBuffer = Buffer.from(salt(normalize(password)), 'utf8');
    return pbkdf2_1.pbkdf2Sync(mnemonicBuffer, saltBuffer, 2048, 64, 'sha512');
}

I realised the discrepancy is because the Polkadot derivation is converting the mnemonic to entropy during this process, whereas the JS package is not. This is the reason that deriving private/public pairs from the JS BIP39 package and the Polkadot keyring package gives you different addresses

EDIT: When I modified the JS seed derivation to use the mnemonicToEntropy method then salt + hash I get the same results as the Rust packages. So I can get the same results but I'm not sure which one is the correct way to implement BIP39

maciejhirsz commented 3 years ago

Ok, I see where the confusion comes from.

BIP39 (the spec, not the crate or the npm package) is just defining the dictionaries and how to create mnemonics from entropy. It's BIP32 that does derivation for Bitcoin and Ethereum, which both tiny-bip39 and bip39 package provide for convenience.

However, Polkadot/Substrate does not use BIP32, what you want to use for Polkadot/Substrate is substrate-bip39.

CodeForcer commented 3 years ago

Awesome thanks for your help @maciejhirsz, you've cleared up all my confusion here