trezor / python-shamir-mnemonic

BIP-0039 / SLIP-0039 integration #40

Closed: alandefreitas closed this issue 2 years ago

alandefreitas commented 2 years ago

It would obviously be great if we could also use Shamir's secret sharing to split BIP-0039 seeds directly.

Of course, SLIP-0039 says this could happen

only at the price of all SLIP-0039 shares being 59 words long regardless of the length of the original because the 512 bit seed is what would need to be split using SLIP-0039.

Furthermore, anyone who is using several different passphrases with one BIP-0039 mnemonic to have several wallets can convert only one of these wallets to SLIP-0039 shares.

Are these points expanded anywhere? I'm probably missing something but, from my ignorant point of view, both statements seem unjustified as they are.

A 12-word BIP-0039 mnemonic encodes 16 bytes (128 bits) + 4 checksum bits of information before the passphrase comes in.
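The bit arithmetic behind that statement can be checked with a short sketch. It assumes the BIP-0039 layout of 11 bits per word and one checksum bit per 32 bits of entropy; the function name `bip39_layout` is made up for illustration.

```python
def bip39_layout(word_count: int) -> tuple[int, int]:
    """Return (entropy_bits, checksum_bits) for a BIP-0039 mnemonic length."""
    total_bits = word_count * 11  # each word is an index into a 2048-word list
    # total = ENT + ENT / 32, hence ENT = total * 32 / 33
    entropy_bits = total_bits * 32 // 33
    checksum_bits = total_bits - entropy_bits
    return entropy_bits, checksum_bits


for words in (12, 18, 24):
    print(words, bip39_layout(words))
```

For 12 words this gives 128 entropy bits and 4 checksum bits, which is the 16 bytes mentioned above.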

It seems like the SLIP-0039 master secret already also encodes 16 bytes by default:

> shamir create 3of5
Using master secret: 42818ca31b45a696cced1ea399273aca
[...]

So considering:

we could just

> shamir create 3of5 --mnemonics='word1 word2 word3 word4 ...'
Using master secret: (first 16 bytes of bip39-secret in hex as above)
[...]
(work as usual)

which generates shares that are 20 words long rather than

being 59 words long
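Where the 20 and 59 come from can be sketched as follows. This assumes the SLIP-0039 share layout as I understand it: 10 bits per word, 40 bits of metadata (identifier, iteration exponent, group and member fields), the share value padded up to a whole number of words, and a 30-bit checksum. The constant and function names are illustrative only.

```python
import math

BITS_PER_WORD = 10   # SLIP-0039 uses a 1024-word list
METADATA_WORDS = 4   # 40 bits of identifier, exponent, group and member fields
CHECKSUM_WORDS = 3   # 30-bit checksum


def slip39_share_words(secret_bits: int) -> int:
    """Number of words in one share for a secret of the given bit length."""
    value_words = math.ceil(secret_bits / BITS_PER_WORD)  # padded share value
    return METADATA_WORDS + value_words + CHECKSUM_WORDS


for bits in (128, 256, 512):
    print(bits, slip39_share_words(bits))
```

Under these assumptions a 128-bit secret yields 20-word shares, while a 512-bit BIP-0039 seed yields 59-word shares.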

I implemented that in this fork: https://github.com/alandefreitas/python-shamir-bip39

And then recovering the BIP-0039 seed seems like it would all just be a matter of:

And in this case, users can just remember their

several different passphrases

as they are usually expected to and recover any of them and not

only one of these wallets to SLIP-0039 shares

All of this sounds just a little too easy, so there's probably another requirement I'm missing. But well, maybe something could be improved in SLIP-0039 anyway because there's nothing in there specifically justifying these statements and other people might have the same question.

prusnak commented 2 years ago

Please read Design Rationale 9 in SLIP39: https://github.com/satoshilabs/slips/blob/master/slip-0039.md#Bip39Compatibility

There we explain why we purposely decided to not include that.

alandefreitas commented 2 years ago

I did read it. In fact, it's exactly the quote I included in this issue.

The issue is not that I couldn't find this section. The issue is the rationale in this section is wrong or at best not well justified and this causes problems:

I included the implementation proving that it is possible.

Maybe there's some problem in my implementation I'm not aware of and maybe this problem is a good reason not to do it. In any case,

matejcik commented 2 years ago

here is the problem you are missing:

A 12-word BIP-0039 mnemonic encodes 16 bytes (128 bits) + 4 checksum bits of information before the passphrase comes in.

A 12-word mnemonic does not encode anything. It is true that it is reversibly generated from 128 bits of entropy plus checksum. The issue is what comes next: in order to derive the key from the mnemonic, you run it through a hash function, to obtain a 512-bit root secret.

The initial entropy is never used.

This is in contrast to SLIP-39, which reversibly encrypts the master secret itself.

In order to implement your scheme, a wallet would need to:

  1. read the SLIP-39 shares
  2. decrypt the master secret
  3. use the master secret as the initial entropy for BIP-39
  4. generate text of the mnemonic from the BIP-39 wordlist
  5. hash this string to obtain the root secret of the BIP-32 tree.
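Steps 3 to 5 above can be sketched in a few lines. This is only an illustration under stated assumptions: steps 1 and 2 are taken as already done (the master secret is the hex value from the first post), the word list is a placeholder rather than the real BIP-39 English list, and the helper names are hypothetical.

```python
import hashlib
import unicodedata


def entropy_to_words(entropy: bytes, wordlist: list[str]) -> list[str]:
    """Steps 3-4: append the checksum bits and cut the result into 11-bit word indices."""
    ent_bits = len(entropy) * 8
    cs_bits = ent_bits // 32
    checksum = hashlib.sha256(entropy).digest()[0] >> (8 - cs_bits)
    value = (int.from_bytes(entropy, "big") << cs_bits) | checksum
    n_words = (ent_bits + cs_bits) // 11
    return [wordlist[(value >> (11 * (n_words - 1 - i))) & 0x7FF] for i in range(n_words)]


def words_to_seed(words: list[str], passphrase: str = "") -> bytes:
    """Step 5: PBKDF2-HMAC-SHA512 over the mnemonic sentence, salted with the passphrase."""
    sentence = unicodedata.normalize("NFKD", " ".join(words))
    salt = unicodedata.normalize("NFKD", "mnemonic" + passphrase)
    return hashlib.pbkdf2_hmac("sha512", sentence.encode("utf-8"), salt.encode("utf-8"), 2048, dklen=64)


# Placeholder list; a real wallet would load the 2048 BIP-39 English words instead.
PLACEHOLDER_WORDLIST = [f"w{i:04d}" for i in range(2048)]

# Assumed output of steps 1-2 (combining the shares and decrypting the master secret).
master_secret = bytes.fromhex("42818ca31b45a696cced1ea399273aca")
words = entropy_to_words(master_secret, PLACEHOLDER_WORDLIST)
seed = words_to_seed(words, passphrase="")
print(len(words), len(seed))  # 12 words, 64-byte seed
```

Note that the passphrase only enters in the last step, which is why the scheme ends up depending on the BIP-39 word list and hashing details.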

While it is definitely possible to do it this way, as your implementation proves, SLIP-39 is not meant as a layer on top of BIP-39, and does not intend to rely on implementation details of BIP-39 (whose design is arguably flawed).

It would also mean that you now have two passphrases in play: the SLIP-39 passphrase which can cause you to produce a different BIP-39 mnemonic, AND the BIP-39 passphrase that is inserted in the middle.

As written, SLIP-39 can only be used to encode the 512-bit root secret, i.e., the result of hashing together the mnemonic string and the passphrase. This is also the reason why a SLIP-39 encoding of the root secret would imply precisely the one and only BIP-39 passphrase.
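A small sketch of why splitting the 512-bit root secret pins down one passphrase: the same mnemonic hashed with different passphrases gives unrelated seeds. The passphrases here are made-up examples, and Unicode normalization is skipped because the inputs are plain ASCII.

```python
import hashlib


def bip39_seed(mnemonic: str, passphrase: str) -> bytes:
    """BIP-39 seed derivation for ASCII inputs (no NFKD normalization needed)."""
    salt = ("mnemonic" + passphrase).encode("utf-8")
    return hashlib.pbkdf2_hmac("sha512", mnemonic.encode("utf-8"), salt, 2048, dklen=64)


mnemonic = "abandon " * 11 + "about"
seeds = {p: bip39_seed(mnemonic, p) for p in ("", "savings", "travel")}

for passphrase, seed in seeds.items():
    print(repr(passphrase), seed.hex()[:16], "...")
```

Each passphrase produces its own 64-byte seed, so shares of any one seed recover only the wallet that belongs to that passphrase.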

alandefreitas commented 2 years ago

@matejcik Thanks! That was useful.

A 12-word mnemonic does not encode anything

This is true in a very limited sense.

It is correct that "in order to derive the key from the mnemonic, you run it through a hash function, to obtain a 512-bit root secret", but the "protocol-level" or "personal" decision (let's call it that) to throw the original mnemonic away does not mean the mnemonic you just threw away doesn't encode anything in any mathematical sense. The 512 bits in the seed still have mathematically only 128 bits of information if you have the mnemonics, which you should.

For instance, seedtool-cli allows you to convert from bip39 encoding to hex and back all you want. Each group of bits in bip39 matches the corresponding group of bits in a random string. This is literally what "encoding" means. And this is implemented all around.
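The round trip can be illustrated without any particular tool. The sketch below works on 11-bit word indices instead of actual words, so no word list is needed; the function names are hypothetical and this is not how seedtool-cli itself is implemented.

```python
import hashlib


def entropy_to_indices(entropy: bytes) -> list[int]:
    """Encode entropy plus checksum bits as 11-bit BIP-39 word indices."""
    ent_bits = len(entropy) * 8
    cs_bits = ent_bits // 32
    checksum = hashlib.sha256(entropy).digest()[0] >> (8 - cs_bits)
    value = (int.from_bytes(entropy, "big") << cs_bits) | checksum
    n = (ent_bits + cs_bits) // 11
    return [(value >> (11 * (n - 1 - i))) & 0x7FF for i in range(n)]


def indices_to_entropy(indices: list[int]) -> bytes:
    """Decode word indices back to the entropy, verifying the checksum bits."""
    value = 0
    for index in indices:
        value = (value << 11) | index
    total_bits = len(indices) * 11
    cs_bits = total_bits // 33
    entropy = (value >> cs_bits).to_bytes((total_bits - cs_bits) // 8, "big")
    expected = hashlib.sha256(entropy).digest()[0] >> (8 - cs_bits)
    if value & ((1 << cs_bits) - 1) != expected:
        raise ValueError("checksum mismatch")
    return entropy


original = bytes.fromhex("42818ca31b45a696cced1ea399273aca")
indices = entropy_to_indices(original)
print(len(indices), indices_to_entropy(indices) == original)
```

The decoded bytes equal the original 16 bytes, which is the sense in which the mnemonic is a reversible encoding of the entropy.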

The initial entropy is never used

Of course it's used. It's represented as words and it's used when "you run it through a hash function, to obtain a 512-bit root secret" and it's used when people store it so they can "run it through a hash function, to obtain a 512-bit root secret". It's what people are storing anyway.

If anything, it would be easier and equally bad to say the "512-bit root secret" is never used. It's not what people are storing, you can regenerate it from the word list at any time, and it's not what the blockchain needs. Of course, saying "it's never used" would make just as little sense.

In order to implement your scheme, a wallet would need to:

Exactly that.

While it is definitely possible to do it this way, as your implementation proves, SLIP-39 is not meant as a layer on top of BIP-39, and does not intend to rely on implementation details of BIP-39 (whose design is arguably flawed).

This is precisely the point. SLIP-39 does not make it clear at all. Quite to the contrary.

If this is an explicit decision not to support or "rely on implementation details" of BIP-39, that's fine. If this is an explicit decision to not be "a layer on top of BIP-39" because of its design flaws, that's also fine.

The problem is that this is not what SLIP-39 says. SLIP-39 says nothing about "compatibility with BIP-0039" being a bad thing. It just says this is provably not possible, which from the way the text is structured pretty much sounds like you wanted this compatibility but unfortunately can't.

SLIP-39 says this is possible "only at the price of all SLIP-0039 shares being 59 words long" precisely when it's talking about BIP-0039. This is not correct if we encode the mnemonic directly. Like the implementation I provided shows. seedtool-cli also provides this option to split seeds.

The reason it gives for that is that "the mnemonic and passphrase are processed by PBKDF2-SHA-512 to produce a 512 bit seed" but it provides no argument for why we can't split the mnemonic directly, like seedtool-cli and the implementation I provided do, for instance. It certainly does not mention as a reason that this is an explicit design choice because BIP-0039's "design is arguably flawed" at all. It just says it's not possible.

This, of course, also invalidates the following statement that "anyone who is using several different passphrases with one BIP-0039 mnemonic to have several wallets can convert only one of these wallets to SLIP-0039 shares".

Just to show how these assumptions are problematic, they lead to the conclusion that "Users who wish to take advantage of Shamir's secret sharing are advised to transfer their funds from their old BIP-0039 wallet to a new wallet backed-up using SLIP-0039.". So users who are happy with the "design flaws" of BIP-0039 would pay fees and maybe buy new steel plates for no reason.

It would also mean that you now have two passphrases in play: the SLIP-39 passphrase which can cause you to produce a different BIP-39 mnemonic, AND the BIP-39 passphrase that is inserted in the middle.

This is also a design choice. Not if you don't include the extra SLIP-39 passphrase, which you shouldn't in this case anyway. In the implementation I included, it throws an error if the user provides a BIP39 mnemonic and a (SLIP-39) passphrase as input.

As written, SLIP-39 can only be used to encode the 512-bit root secret, i.e., the result of hashing together the mnemonic string and the passphrase. This is also the reason why a SLIP-39 encoding of the root secret would imply precisely the one and only BIP-39 passphrase

The key here seems to be "SLIP-39 can only be used to encode the 512-bit root secret" as written. This is a choice, while SLIP-0039 makes it sound like it's an impossibility "only at the price of all SLIP-0039 shares being 59 words long".

prusnak commented 2 years ago

Your post still ignores the fact that such conversion from bip39 to slip39 screws up the passphrase handling completely.

alandefreitas commented 2 years ago

Your post still ignores the fact that such conversion from bip39 to slip39 screws up the passphrase handling completely.

It doesn't.

I just said:

This is also a design choice. Not if you don't include the extra SLIP-39 passphrase, which you shouldn't in this case anyway. In the implementation I included, it throws an error if the user provides a BIP39 mnemonic and a (SLIP-39) passphrase as input.

among some other things I said about the passphrase before.

If you don't want to mix the passphrases, then SLIP-39 shouldn't have its own passphrase if/when it's encoding BIP39 because BIP39 already has its passphrase and this might be confusing. SLIP-39 would focus on the encoding in this case. Maybe this is still technically possible but not possible for SLIP-39 because SLIP-39 is already invested in this design choice. Maybe some other SLIP would be more appropriate. In any case, this only reinforces the point that this is a design choice and not an impossibility as SLIP-39 describes.

Maybe that's a good point. Maybe that's a bad point. But it doesn't ignore the point even if you keep saying it does.

matejcik commented 2 years ago

This is true in a very limited sense. (...) This is literally what "encoding" means. And this is implemented all around. (...) "SLIP-39 can only be used to encode the 512-bit root secret" as written.

I believe we're basically in agreement about actual facts of the matter, and the rest is semantics. So here's my argument from semantics:

Both BIP-39 and SLIP-39 are standards for backing up BIP-32 root seeds.

BIP-39 is a specification that prescribes generating a mnemonic code and converting said code to a binary seed. It's not an encoding specification. The reversible entropy<->words encoding is implied, but the spec is very clear that the "binary seed" is the thing that is supposed to be the result.

SLIP-39 is a specification of an implementation of Shamir's secret-sharing (SSS) and a specification for its use in backing up Hierarchical Deterministic Wallets. Similar to BIP-39, it implies a reversible words<->bytes encoding. It also very explicitly specifies that the master secret is the thing you use as the BIP-32 root seed (and not, for example, BIP-39 initial entropy).

You can of course take the mnemonic and seed generation part out of BIP-39, and the Shamir splitting and mnemonic encoding parts out of SLIP-39, plug them into each other, specify that SLIP-39 passphrase will be empty and the BIP-39 passphrase will be used, and have a thing that works very nicely.

That thing however is neither BIP-39 nor SLIP-39. It's a new separate standard.

To the original post, there is no fundamental technical reason why it wouldn't work. You are absolutely free to write that standard. This repository however is a reference implementation of SLIP-39, so not an ideal place to start implementing a different standard.

To your complaint that SLIP-39 rationale about BIP-39 compatibility feels incomplete/incorrect

While it is definitely possible to do it this way, as your implementation proves, SLIP-39 is not meant as a layer on top of BIP-39, and does not intend to rely on implementation details of BIP-39 (whose design is arguably flawed).

This is precisely the point. SLIP-39 does not make it clear at all. Quite to the contrary.

You are right in a sense, but SLIP-39 is very clear that it is "mainly intended as a replacement for BIP-39". You might want to raise a point that there is no rationale for "why we decided NOT to plug this into BIP-39", or "why we decided to replace BIP-39 instead of building on top of it." But that decision is a given for SLIP-39, and the "Compatibility" section only discusses compatibility in context of this decision.

alandefreitas commented 2 years ago

You can of course take the mnemonic and seed generation part out of BIP-39, and the Shamir splitting and mnemonic encoding parts out of SLIP-39, plug them into each other, specify that SLIP-39 passphrase will be empty and the BIP-39 passphrase will be used, and have a thing that works very nicely.

That thing however is neither BIP-39 nor SLIP-39. It's a new separate standard.

OK. We are in agreement here.

To the original post, there is no fundamental technical reason why it wouldn't work. You are absolutely free to write that standard. This repository however is a reference implementation of SLIP-39, so not an ideal place to start implementing a different standard.

OK. We are also in agreement here. It's clear this is an explicit design choice of SLIP-0039 and this repository is intended to implement SLIP-0039 and nothing else, which is more than reasonable, so we are in agreement.

That's so reasonable that I opened this more as a question than as an issue. I wouldn't open an issue if github discussions were open, for instance. The only small problem here is the first answers I got were very different from what you just said. But that's fine. We are in agreement now.

You are right in a sense, but SLIP-39 is very clear that it is "mainly intended as a replacement for BIP-39". You might want to raise a point that there is no rationale for "why we decided NOT to plug this into BIP-39", or "why we decided to replace BIP-39 instead of building on top of it." But that decision is a given for SLIP-39, and the "Compatibility" section only discusses compatibility in context of this decision.

OK. 100% in agreement. It's a design choice and this design choice is a given.

only discusses compatibility in context of this decision

Exactly. I understand it now.

I misunderstood it before because, by (i) describing it as a replacement only in the abstract, (ii) considering it as a given precisely in the section that discusses BIP-0039, and (iii) not providing a rationale for that, the document unfortunately seems to imply this is an impossibility and not a design choice.

For instance, I've seen dozens of discussions online about SLIP-0039 (such as this one and many others on reddit, here on github, etc...) where everyone mostly just assumes this is an impossibility rather than a design choice and someone always links to this section about BIP-0039 to justify it.

Regardless of how we got to this point, BIP-0039 is supported almost everywhere. This is unfortunate because people are assuming lots of things all around and missing lots of opportunities. For instance, I've seen lots of blog posts about hardware wallets and how it is impossible (not a design choice) to split seeds without a trezor wallet because other wallets only support BIP-0039 and this would make each share 59 words long.

You might want to raise a point that there is no rationale for

Yes. That's exactly the point. It seems like the lack of a rationale ended up being (not purposefully, of course) misleading to many other discussions. At least I understand the rationale, the pros and cons now. So it is what it is.

Thanks @matejcik !