Investigate use of NFKD

steveatinfincia commented 7 years ago

BIP-0039 says NFKD normalization needs to be applied in two situations:

When generating the wordlists

The standard says this:

The wordlist can contain native characters, but they must be encoded in UTF-8 using Normalization Form Compatibility Decomposition (NFKD).

This should be taken care of because the wordlist in bip39-rs is from the BIP-0039 repo and has already been processed correctly.
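A quick way to double-check that assumption is to confirm that no word changes when NFKD is applied. The following is a minimal Python sketch, not bip39-rs code: the helper name `non_nfkd_words` is hypothetical, and the sample strings are illustrative rather than read from a real wordlist file.

```python
import unicodedata

def non_nfkd_words(words):
    """Return the words that change when NFKD is applied."""
    return [w for w in words if unicodedata.normalize("NFKD", w) != w]

# Illustrative strings only; not taken from any real wordlist file.
composed = "\u304c\u3063\u3053\u3046"  # starts with precomposed U+304C
decomposed = unicodedata.normalize("NFKD", composed)  # U+304B + U+3099 ...

# Only the precomposed form is reported as not being in NFKD.
print(non_nfkd_words(["abandon", composed, decomposed]))
```

An empty result for a whole wordlist would mean the file is already in NFKD and needs no processing at load time.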

When turning a mnemonic phrase into a seed

The standard says this:

To create a binary seed from the mnemonic, we use the PBKDF2 function with a mnemonic sentence (in UTF-8 NFKD) used as the password and the string "mnemonic" + passphrase (again in UTF-8 NFKD) used as the salt. The iteration count is set to 2048 and HMAC-SHA512 is used as the pseudo-random function. The length of the derived key is 512 bits (= 64 bytes).

We currently make no attempt to follow this and should.
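For reference, the derivation quoted above can be sketched with the Python standard library alone. This is an illustration of the BIP-0039 text, not the bip39-rs implementation; the function name `mnemonic_to_seed` is an assumption made for the example.

```python
import hashlib
import unicodedata

def mnemonic_to_seed(mnemonic: str, passphrase: str = "") -> bytes:
    """Derive the 64-byte seed as described in the BIP-0039 quote."""
    def nfkd(s: str) -> str:
        return unicodedata.normalize("NFKD", s)

    # Password: the mnemonic sentence in UTF-8 NFKD.
    password = nfkd(mnemonic).encode("utf-8")
    # Salt: the string "mnemonic" followed by the passphrase in UTF-8 NFKD.
    salt = ("mnemonic" + nfkd(passphrase)).encode("utf-8")
    # PBKDF2 with HMAC-SHA512, 2048 iterations, 64-byte derived key.
    return hashlib.pbkdf2_hmac("sha512", password, salt, 2048, dklen=64)

phrase = "abandon " * 11 + "about"
# The same passphrase typed in precomposed and decomposed form.
seed_a = mnemonic_to_seed(phrase, "caf\u00e9")
seed_b = mnemonic_to_seed(phrase, "cafe\u0301")
print(seed_a == seed_b)
```

Without the normalization step the two passphrases encode to different bytes and would produce different seeds, which is why skipping NFKD only shows up with non-ASCII input.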

burdges commented 5 years ago

I believe the unicode-normalization crate provides this as UnicodeNormalization::nfkd.

QuestofIranon commented 5 years ago

I've been working on adding NFKD normalization and need reliable test vectors in non-English languages. (I already have a Japanese set.)

wigy-opensource-developer commented 5 years ago

I found some in the NBitcoin project: https://github.com/MetacoSA/NBitcoin/tree/master/NBitcoin.Tests/data

QuestofIranon commented 5 years ago

Nice find @wigy-opensource-developer!

QuestofIranon commented 5 years ago

The tests there were generated with https://github.com/nym-zone/easyseed

wigy-opensource-developer commented 5 years ago

Maybe this could be interesting for the code fix: un-normalized input for Japanese phrases, to test normalization: https://github.com/bip32JP/bip32JP.github.io/commit/360c05a6439e5c461bbe5e84c7567ec38eb4ac5f (I do not speak Japanese, so I would need to rely on these to make test vectors myself :blush: )
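To illustrate what un-normalized Japanese input can look like, here is a small Python sketch. The phrase is made up for the example and is not taken from the linked commit, and `normalize_phrase` is a hypothetical helper. It relies on two Unicode facts: NFKD maps the ideographic space U+3000 to an ordinary space, and it splits precomposed voiced kana into a base character plus a combining mark.

```python
import unicodedata

def normalize_phrase(phrase: str) -> str:
    """Apply NFKD to a mnemonic phrase before any further processing."""
    return unicodedata.normalize("NFKD", phrase)

# Made-up input: precomposed U+304C and an ideographic space (U+3000)
# between two kana words.
raw = "\u304c\u3063\u3053\u3046\u3000\u3042\u3044"
norm = normalize_phrase(raw)

# After NFKD the separator is an ASCII space, so a plain split works.
words = norm.split(" ")
print(len(raw), len(norm), len(words))
```

A test vector built this way would feed `raw` to the library and expect the same seed as for `norm`.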

infincia / bip39-rs

Investigate use of NFKD #1