BlockstreamResearch / codex32

A paper computer for Shamir's Secret Sharing over the Bech32 alphabet.

Improved BIP39 Backwards compatibility #60

Closed BenWestgate closed 1 year ago

BenWestgate commented 1 year ago

Backwards Compatibility

codex32 is an alternative to BIP-0039 and SLIP-0039. It is technically possible to derive the BIP32 master seed from seed words encoded in one of these schemes, and then to encode this seed in codex32.

It's also technically possible to directly store their existing 12, 18 or 24 words in "BIP39 Backwards Compatibility Mode" and recover them transparently.

To create a new backwards compatible codex32 secret:

  1. 'ms1' remains unchanged.
  2. Choose a valid threshold.
  3. Choose the identifier 'SP39' to enable "BIP39 compatibility mode".
  4. Choose the share index 's'.
  5. Convert the mnemonic to raw binary, including the checksum.
  6. Append 4 bits to encode the BIP39 wordlist used.
  7. Set the payload to a bech32 encoding of this binary data, padded with arbitrary bits.
  8. Generate a valid checksum in accordance with the Checksum section.
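
Steps 5 to 7 can be sketched in Python as follows. This is only an illustration under stated assumptions: it starts from the raw BIP39 entropy rather than the words (the bits are the same as concatenating the 11-bit word indices, and it avoids embedding a 2048-word list), the function names are hypothetical, and the 4-bit wordlist code (0 = English here) is an assumed mapping that this proposal does not define. The codex32 checksum from step 8 is not computed.

```python
import hashlib

# Bech32 alphabet; each character encodes 5 bits.
BECH32_CHARSET = "qpzry9x8gf2tvdw0s3jn54khce6mua7l"


def bip39_bits(entropy: bytes) -> str:
    """Return the BIP39 entropy followed by its checksum, as a bit string."""
    ent_bits = len(entropy) * 8
    if ent_bits not in (128, 160, 192, 224, 256):
        raise ValueError("BIP39 entropy must be 128, 160, 192, 224 or 256 bits")
    cs_bits = ent_bits // 32  # BIP39: one checksum bit per 32 entropy bits
    first_hash_byte = hashlib.sha256(entropy).digest()[0]
    ent = format(int.from_bytes(entropy, "big"), "b").zfill(ent_bits)
    checksum = format(first_hash_byte, "08b")[:cs_bits]
    return ent + checksum


def compat_payload(entropy: bytes, wordlist_id: int, pad_bit: str = "0") -> str:
    """Build the payload: entropy + checksum + 4-bit wordlist code + padding.

    wordlist_id is a hypothetical 4-bit code (assumed 0 = English).
    """
    if not 0 <= wordlist_id < 16:
        raise ValueError("wordlist_id must fit in 4 bits")
    bits = bip39_bits(entropy) + format(wordlist_id, "04b")
    bits += pad_bit * (-len(bits) % 5)  # pad to a whole number of characters
    return "".join(
        BECH32_CHARSET[int(bits[i:i + 5], 2)] for i in range(0, len(bits), 5)
    )


# 12-word example: all-zero entropy ("abandon" x11 + "about" in English).
print(compat_payload(bytes(16), wordlist_id=0))
```

For a 12-word mnemonic the payload comes out at 28 characters (132 + 4 bits, padded to 140), two more than the 26 characters a plain 128-bit codex32 secret uses.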

BIP39 Backwards Compatibility Mode:

  1. Identifier 'SP39' signals the wallet to perform backwards compatibility checks.
  2. If the first 132, 165, 198, 231 or 264 bits conclude with 4, 5, 6, 7 or 8 bits of valid BIP39 checksum respectively, treat the next 4 bits as an encoding of the wordlist used.
  3. Use 11-bit chunks of the data preceding the wordlist bits to reproduce the original BIP39 mnemonic.
  4. Ask the user for their optional passphrase.
  5. Import as usual for bip39.
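
The detection in steps 2 and 3 could look roughly like the sketch below. It is a hedged illustration, not a specified algorithm: `detect_bip39` and `payload_to_bits` are hypothetical names, the wordlist code is returned as a bare number because its meaning is not defined here, and the rule that at most 4 padding bits may follow the wordlist code is an added assumption to keep the match unambiguous. Mapping indices back to words needs the BIP39 wordlist, which is left out.

```python
import hashlib

# Bech32 alphabet; each character encodes 5 bits.
BECH32_CHARSET = "qpzry9x8gf2tvdw0s3jn54khce6mua7l"


def payload_to_bits(payload: str) -> str:
    """Expand a bech32 payload into a string of '0'/'1' characters."""
    return "".join(format(BECH32_CHARSET.index(c), "05b") for c in payload.lower())


def detect_bip39(bits: str):
    """Return (word_indices, wordlist_id) if bits hold BIP39 data, else None."""
    for ent_bits in (128, 160, 192, 224, 256):
        cs_bits = ent_bits // 32
        total = ent_bits + cs_bits  # 132, 165, 198, 231 or 264
        leftover = len(bits) - (total + 4)
        # Assumption: only padding (fewer than 5 bits) follows the wordlist code.
        if not 0 <= leftover < 5:
            continue
        entropy = int(bits[:ent_bits], 2).to_bytes(ent_bits // 8, "big")
        expected = format(hashlib.sha256(entropy).digest()[0], "08b")[:cs_bits]
        if bits[ent_bits:total] != expected:
            continue  # checksum mismatch: not BIP39 data of this size
        wordlist_id = int(bits[total:total + 4], 2)
        indices = [int(bits[i:i + 11], 2) for i in range(0, total, 11)]
        return indices, wordlist_id
    return None


# Payload for all-zero 128-bit entropy with wordlist code 0.
print(detect_bip39(payload_to_bits("q" * 26 + "cq")))
```

Each index selects one word from the wordlist named by the wordlist code; a wallet would then prompt for the optional passphrase and continue with the normal BIP39 import.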

I understand this was covered in our BIP:

On approach would be to encode the BIP-0039 entropy along with the BIP-0039 checksum data. This data can directly be recovered from the BIP-0039 mnemonic, and the process can be reversed if one knows the target language. However, for a 128-bit seed, there is a 4 bit checksum yielding 132 bits of data that needs to be encoded. This exceeds the 130-bits of room that we have for storing 128 bit seeds. We would have to compromise on the 48 character size, or the size of the headers, or the size of the checksum in order to add room for an additional character of data.

Typo: "On" should be "One".

  1. Target language is stored. So it's reversible.
  2. The checksum bits are useful to make electronic imports aware they're seeing BIP39 data, with an exceedingly small chance that it's an accident (1:8 million to 1:256 million), and only among backups using an unusual string length. In practice it will never happen. This enables the recovery to be transparent to users.
  3. I don't see the issue with the length being 50 characters. If anything, the length combined with a fixed identifier makes it easier for software to detect the backwards-compatible encoding.
  4. Some storage devices lack room for more than 48 characters; in that case, just drop 2 characters of the checksum or the 'ms'. There's no chance of mistaking it for a p2wsh (40) or p2pkh (60) address, which you'd never put in metal anyway. Our current import document would handle either shortening without issue.
  5. Existing bip39 mnemonics can be converted into this format and split into shares without electronics.
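
The 48 and 50 character figures in this discussion follow from the codex32 string layout ("ms" + "1", one threshold character, four identifier characters, one share index character, the payload, and a 13-character checksum). A small Python check, where `codex32_length` is a hypothetical helper and the extra 4 bits are the wordlist code proposed above:

```python
import math

# Fixed parts of a codex32 string, in characters.
HRP_AND_SEPARATOR = 3  # "ms" + "1"
THRESHOLD = 1
IDENTIFIER = 4
SHARE_INDEX = 1
CHECKSUM = 13  # short codex32 checksum


def codex32_length(payload_bits: int) -> int:
    """Total string length for a payload of the given size in bits."""
    payload_chars = math.ceil(payload_bits / 5)  # 5 bits per bech32 character
    return (HRP_AND_SEPARATOR + THRESHOLD + IDENTIFIER + SHARE_INDEX
            + payload_chars + CHECKSUM)


print(codex32_length(128))          # plain 128-bit seed
print(codex32_length(128 + 4 + 4))  # 12 words: entropy + checksum + wordlist code
```

The first call gives 48 and the second gives 50, so the compatibility encoding of a 12-word mnemonic costs two extra characters.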

Edit: I see 'B' is not in the charset, so the identifier could be MN39 for "mnemonic", SW39 for "seed words", SP39 for "seed phrase" or RP39 for "recovery phrase".

apoelstra commented 1 year ago

It's probably worthwhile to spec this all out somewhere (I haven't read it carefully yet -- though my first comment is that we should replace ms because BIP39 words are not master seeds; I use bip39_12 and bip39_24 for my own stuff), but we made a deliberate decision not to include it in the BIP anywhere, both as a BIP39 protest and because we are not interested in maintaining any bip39 related code.

It's also a lot of extra complexity that can lead to user error -- in particular, having done the "convert to binary then to base 2^11" process by hand, I can say that it's a frequent source of errors and that you have no checksum to help you. So it winds up being the weak link in the entire process and undermines all the other work.

Maybe it could try to be "SLIP93" or something? It seems like the SLIPs repo is often the "home for specs that users want but purists don't".

BenWestgate commented 1 year ago

I have a total of zero funded bip39 seed backups and my project doesn't plan to support bip39 importing (or conversion either) so I have no need to complete this now. I'll offer some help if someone else wants this.

Do note the typo in the paragraph I quoted if you have a BIP update planned.