polkascan / py-substrate-interface

Python Substrate Interface
https://polkascan.github.io/py-substrate-interface/
Apache License 2.0

Bug with create_from_uri() on French mnemonic #218

Closed by vtexier 2 years ago

vtexier commented 2 years ago

Version: 1.2.4

Keypair.create_from_uri() raises an exception on a French mnemonic.

        suri_regex = re.match(r'^(?P<phrase>\w+( \w+)*)(?P<path>(//?[^/]+)*)(///(?P<password>.*))?$', suri)

>       suri_parts = suri_regex.groupdict()
E       AttributeError: 'NoneType' object has no attribute 'groupdict'

It seems that the regex does not handle French special characters.
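
One way to check this is to run the pattern from the traceback against the two Unicode forms of an accented word. This is only a minimal sketch using the standard library: the helper name `matches_suri` is hypothetical, and the explicit escapes are an assumption about how the failing string might have been encoded. In Python's `re`, `\w` matches the precomposed letter U+00E9 but not the combining acute accent U+0301.

```python
import re

# Pattern copied from the traceback above.
SURI_PATTERN = re.compile(
    r'^(?P<phrase>\w+( \w+)*)(?P<path>(//?[^/]+)*)(///(?P<password>.*))?$'
)


def matches_suri(suri: str) -> bool:
    """Hypothetical helper: True if the pattern accepts the string."""
    return SURI_PATTERN.match(suri) is not None


# Precomposed form: the accented letter is the single code point U+00E9.
composed = "ogive m\u00e9decin acheter"
# Decomposed form: plain "e" followed by the combining acute accent U+0301.
decomposed = "ogive me\u0301decin acheter"

print(matches_suri(composed))    # True
print(matches_suri(decomposed))  # False
```

Both strings render identically on screen, so a mnemonic copied from one source can fail while the same words typed by hand succeed.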

So I added a test in my local repo to confirm, and the test raises an exception on French only; Chinese is OK:

    def test_hdkd_multi_lang(self):
        mnemonic_path = "秘 心 姜 封 迈 走 描 朗 出 莫 人 口//0"
        keypair = Keypair.create_from_uri(mnemonic_path, language_code=MnemonicLanguageCode.CHINESE_SIMPLIFIED)
        self.assertNotEqual(keypair, None)

        mnemonic_path = "nation armure tympan devancer temporel capsule ogive médecin acheter narquois abrasif brasier"
        keypair = Keypair.create_from_uri(mnemonic_path, language_code=MnemonicLanguageCode.FRENCH)
        self.assertNotEqual(keypair, None)

arjanz commented 2 years ago

I suspect there is an issue with the multi-byte representation of the mnemonic above, because the mnemonic below works and looks identical, but has a different byte-wise representation:

    def test_create_mnemonic_multi_lang(self):
        mnemonic_path = "nation armure tympan devancer temporel capsule ogive médecin acheter narquois abrasif brasier"
        keypair = Keypair.create_from_uri(mnemonic_path, language_code=MnemonicLanguageCode.FRENCH)
        self.assertNotEqual(keypair, None)

I will have to look into a way to convert the multi-byte string to be usable in both scenarios.
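
One possible conversion is to normalize the input to a single Unicode form before it reaches the regex. The sketch below assumes NFC (composed) normalization via the standard `unicodedata` module; the helper name `normalize_suri` is hypothetical, and this is not necessarily the fix adopted in the library.

```python
import re
import unicodedata

# Pattern copied from the traceback earlier in this issue.
SURI_PATTERN = re.compile(
    r'^(?P<phrase>\w+( \w+)*)(?P<path>(//?[^/]+)*)(///(?P<password>.*))?$'
)


def normalize_suri(suri: str) -> str:
    """Hypothetical helper: compose base letters and combining marks (NFC)."""
    return unicodedata.normalize("NFC", suri)


# Decomposed input: "e" + combining acute accent (U+0301), plus a derivation path.
decomposed = "me\u0301decin acheter//0"
normalized = normalize_suri(decomposed)

# The raw decomposed string is rejected by the pattern...
assert SURI_PATTERN.match(decomposed) is None

# ...while the normalized string is accepted and splits as expected.
match = SURI_PATTERN.match(normalized)
phrase = match.group("phrase")
path = match.group("path")
print(path)  # //0
```

Whether the wordlist lookup further down expects composed or decomposed text is not stated in this thread, so the normalization form may need to be adjusted to match it.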