Open steveatinfincia opened 7 years ago
I believe the unicode-normalization crate provides this as UnicodeNormalization::nfkd.
I've been working on adding NFKD normalization and need reliable test vectors in non-English languages. (I already have a Japanese set.)
I found some in the NBitcoin project: https://github.com/MetacoSA/NBitcoin/tree/master/NBitcoin.Tests/data
Nice find @wigy-opensource-developer!
The tests there were generated with https://github.com/nym-zone/easyseed
Maybe this could be interesting too: Japanese phrases whose input is not normalized, for testing normalization: https://github.com/bip32JP/bip32JP.github.io/commit/360c05a6439e5c461bbe5e84c7567ec38eb4ac5f (I do not speak Japanese, so I would need to rely on these to make test vectors myself :blush: )
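For anyone who wants to see what "not normalized" means for Japanese input, here is a small sketch in Python using only the standard library (Python is just for illustration here; the crate itself is Rust). The helper names `nfkd` and `is_nfkd` are made up for this example. It shows two cases that commonly matter for Japanese phrases: a precomposed kana with a voicing mark, and the ideographic space U+3000.

```python
import unicodedata


def nfkd(text: str) -> str:
    """Return the NFKD form of text (hypothetical helper for this sketch)."""
    return unicodedata.normalize("NFKD", text)


def is_nfkd(text: str) -> bool:
    """True if text is already in NFKD form."""
    return nfkd(text) == text


# Precomposed HIRAGANA LETTER GA (U+304C) decomposes into
# KA (U+304B) followed by the combining voiced sound mark (U+3099).
ga = "\u304c"

# IDEOGRAPHIC SPACE (U+3000) has a compatibility mapping to U+0020.
ideographic_space = "\u3000"

for label, sample in [("ga", ga), ("ideographic space", ideographic_space)]:
    normalized = nfkd(sample)
    before = [f"U+{ord(c):04X}" for c in sample]
    after = [f"U+{ord(c):04X}" for c in normalized]
    print(label, before, "->", after, "already NFKD:", is_nfkd(sample))
```

Both samples change under NFKD, so a phrase typed with either one produces different UTF-8 bytes before and after normalization.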
BIP-0039 suggests NFKD normalization needs to be applied in two situations:
When generating the wordlists
The standard says this:
This should be taken care of because the wordlist in bip39-rs is from the BIP-0039 repo and has already been processed correctly.
When turning a mnemonic phrase into a seed
The standard says this:
To create a binary seed from the mnemonic, we use the PBKDF2 function with a mnemonic sentence (in UTF-8 NFKD) used as the password and the string "mnemonic" + passphrase (again in UTF-8 NFKD) used as the salt. The iteration count is set to 2048 and HMAC-SHA512 is used as the pseudo-random function. The length of the derived key is 512 bits (= 64 bytes).
We currently make no attempt to follow this and should.
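As a cross-check while building test vectors, the seed derivation quoted above can be sketched in Python with only the standard library. This is not bip39-rs code, and the function name `mnemonic_to_seed` is an assumption for illustration; it simply follows the quoted paragraph (NFKD on both inputs, salt "mnemonic" + passphrase, 2048 iterations, HMAC-SHA512, 64-byte output).

```python
import hashlib
import unicodedata


def mnemonic_to_seed(mnemonic: str, passphrase: str = "") -> bytes:
    """Derive a 64-byte seed from a mnemonic, as described in the quoted text.

    Hypothetical helper for this sketch: it does not validate the words or
    the checksum, it only performs normalization and key stretching.
    """
    # Both the mnemonic and the salt are NFKD-normalized, then UTF-8 encoded.
    password = unicodedata.normalize("NFKD", mnemonic).encode("utf-8")
    salt = unicodedata.normalize("NFKD", "mnemonic" + passphrase).encode("utf-8")
    # PBKDF2 with HMAC-SHA512, 2048 iterations, 512-bit (64-byte) derived key.
    return hashlib.pbkdf2_hmac("sha512", password, salt, 2048, dklen=64)


if __name__ == "__main__":
    words = "abandon " * 11 + "about"
    print(mnemonic_to_seed(words, "TREZOR").hex())
```

Because normalization happens before hashing, a phrase whose words are separated by ideographic spaces (U+3000) should yield the same seed as the same phrase separated by ASCII spaces. Skipping the normalization step would give a different seed for the same phrase.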