1200wd / bitcoinlib

Bitcoin and other Cryptocurrencies Library for Python. Includes a fully functional wallet, Mnemonic key generation and management and connection with various service providers to receive and send blockchain and transaction information.
http://bitcoinlib.readthedocs.io/
GNU General Public License v3.0

BIP39 passphrases get mangled if the string contains only hexadecimal characters #350

Closed: stevenceasefire closed this issue 9 months ago

stevenceasefire commented 9 months ago

Python Version: 3.11 Bitcoinlib Version: 0.6.13

I noticed that bitcoinlib produces an incorrect m/84'/0'/0'/0/0 address, compared to both Ian Coleman's BIP39 tool and a Ledger Nano S Plus, for certain BIP39 passphrases (e.g. '1980' and 'aaaa'), but the correct one for others (e.g. 'bob' or 'az19').

As an example, the code below produces bc1qnxfqmt673w5y4xzrwqg7pfua52jjs3rvhf3c0v as the first segwit address, whereas entering the same mnemonic and passphrase into https://iancoleman.io/bip39/ or a Ledger Nano gives bc1q2kl5udylkxrukq965ala8m9g48pa788ax4jx0x. If you swap '1980' for 'bob' as the BIP39 passphrase, all three produce the same m/84'/0'/0'/0/0 address: bc1qajx32e42sgjye5xcm2pfq2geq6vftus05dfcap

from bitcoinlib.keys import HDKey

BIP39_PHRASE = 'used genius sand else oyster skin cloth click casual broccoli permit deputy soccer voyage differ envelope phone animal opinion ginger place forum garage wrong'
BIP39_PASS = '1980'
first_segwit_address = HDKey.from_passphrase(passphrase=BIP39_PHRASE, password=BIP39_PASS, witness_type='segwit').subkey_for_path("m/84'/0'/0'/0/0").address()
print(first_segwit_address)

I'm still a crayon eater when it comes to Python, but I think the issue is with this chunk of code in bitcoinlib's encoding.py:

try:
    if isinstance(string, bytes):
        string = string.decode()
    s = bytes.fromhex(string)
    return s
except (TypeError, ValueError):
    pass

The line s = bytes.fromhex(string) converts the BIP39 password '1980' to b'\x19\x80' and 'aaaa' to b'\xaa\xaa' and returns the result. If the string contains at least one non-hexadecimal character, the call raises ValueError instead, so the try/except block is left without touching the password. The converted password then becomes part of the salt in mnemonic.py, which produces an entirely different master key.

return hashlib.pbkdf2_hmac(hash_name='sha512', password=mnemonic, salt=b'mnemonic' + password,
                           iterations=2048)

...that's my theory anyway. Please let me know if I can offer any additional details.
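
To make the theory easier to check without bitcoinlib installed, here is a minimal standard-library sketch. It assumes the seed is derived exactly as in the pbkdf2_hmac call quoted above. The helper name seed_from is made up for illustration and is not part of bitcoinlib.

```python
import hashlib

MNEMONIC = ('used genius sand else oyster skin cloth click casual broccoli '
            'permit deputy soccer voyage differ envelope phone animal opinion '
            'ginger place forum garage wrong')


def seed_from(mnemonic: str, password_bytes: bytes) -> bytes:
    """Hypothetical helper: PBKDF2 call shaped like the one quoted from mnemonic.py."""
    return hashlib.pbkdf2_hmac(
        hash_name='sha512',
        password=mnemonic.encode('utf-8'),
        salt=b'mnemonic' + password_bytes,
        iterations=2048,
    )


passphrase = '1980'
as_text = passphrase.encode('utf-8')  # the four characters, kept as text
as_hex = bytes.fromhex(passphrase)    # what the hex conversion turns it into

print(as_text)  # b'1980'
print(as_hex)   # b'\x19\x80'

seed_text = seed_from(MNEMONIC, as_text)
seed_hex = seed_from(MNEMONIC, as_hex)

# Different salts give different 64-byte seeds, hence different master keys.
print(seed_text != seed_hex)  # True
```

The two salts are b'mnemonic1980' and b'mnemonic\x19\x80', so every key derived from the seeds diverges. That would be consistent with the address mismatch described above.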

Edit:

For my particular use case, changing the following block in mnemonic.py is working as a temporary fix, since I'm not sending passwords as hex strings.

Old

if validate:
    self.to_entropy(words)
mnemonic = to_bytes(words)
password = to_bytes(password)

New

if validate:
    self.to_entropy(words)
mnemonic = to_bytes(words)
password = to_bytes(password, unhexlify=False)

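
To illustrate why passing unhexlify=False sidesteps the problem, here is a simplified stand-in for the conversion. It is only a sketch based on the snippet quoted earlier: the name to_bytes_sketch is hypothetical, and bitcoinlib's real to_bytes may behave differently in other cases.

```python
def to_bytes_sketch(value, unhexlify=True):
    """Hypothetical, simplified stand-in for the conversion discussed above."""
    if unhexlify:
        try:
            text = value.decode() if isinstance(value, bytes) else value
            return bytes.fromhex(text)  # succeeds only for valid hex strings
        except (TypeError, ValueError):
            pass  # not a hex string, so fall through to plain encoding
    if isinstance(value, str):
        return value.encode('utf-8')
    return value


for passphrase in ('1980', 'aaaa', 'bob', 'az19'):
    default = to_bytes_sketch(passphrase)
    literal = to_bytes_sketch(passphrase, unhexlify=False)
    print(passphrase, default, literal, default == literal)
```

With the flag left at its default, '1980' and 'aaaa' come back as two raw bytes, while 'bob' and 'az19' are encoded as text. With unhexlify=False all four are encoded as text. Odd-length hex strings such as 'abc' also fall through to text encoding, because bytes.fromhex rejects them.
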
mccwdev commented 9 months ago

Good point, I will take a look

mccwdev commented 9 months ago

Fix in 771b9f1329e1bce332c119fc4614463887ad0d5a