satoshilabs / slips

SatoshiLabs Improvement Proposals
Creative Commons Attribution Share Alike 4.0 International

SLIP-0039: Suggestions on SSS Mnemonics #378

Closed ChristopherA closed 5 years ago

ChristopherA commented 5 years ago

Re: https://github.com/satoshilabs/slips/blob/master/slip-0039.md

(copied from https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2018-September/016418.html)

I and a number of companies & communities I am involved with are very interested in this.

A challenge is that Shamir Secret Sharing has subtleties. To quote Greg Maxwell:

“I think Shamir Secret Sharing (and a number of other things, RNGs for example), suffer from a property where they are just complex enough that people are excited to implement them often for little good reason, and then they are complex enough (or have few enough reasons to invest significant time) they implement them poorly.”

Some questions for you:

Two comments:

Among the results of this was a new BIP-39 2048 word compatible word list filtered for memorability (concreteness & emotional valence) and suitability for iambic pentameter, which is located:

https://github.com/ChristopherA/iambic-mnemonic/blob/master/word-lists/iambic-wordlist.json

…which was created from the repo at

https://github.com/ChristopherA/password_poem

You can find a number of other word lists that I’ve collected here: https://github.com/ChristopherA/iambic-mnemonic/blob/master/word-lists/

If you want to replicate what we did with your own criteria, you may want to incorporate information from the CMU dictionary http://www.speech.cs.cmu.edu/cgi-bin/cmudict, the top 5000 words https://github.com/ChristopherA/password_poem/blob/master/top5000.json, concrete word lists http://crr.ugent.be/papers/Concreteness_ratings_Brysbaert_et_al_BRM.txt and emotional words (valence) http://crr.ugent.be/archives/1003

prusnak commented 5 years ago

What other teams or communities besides Trezor are committed to standardizing a Shamir Secret Sharing Scheme?

So far none, but I would expect all hardware wallets to follow suit, as they did in the past with BIP39.

Where do you want to hold discussions on this?

Both mailing list and github issues are fine.

to have a birthday in the seed

They might make sense for Lightning and/or SPV nodes, but not for Trezor. We anticipate this standard being used 10-20+ years into the future, and over a timespan like this, storing a birthday is a very small optimization. It also leaks information that the user might want to keep private.

The wordlist is yet to be defined by @onvej-sl and @andrewkozlik, so I will let them have a look at your wordlist efforts.

devrandom commented 5 years ago

Base Zero is very interested in this, but for key backup purposes only. We don't see SSS as suitable for storing value without a multisig layer, because there is a single point of security failure at the point of reassembly. We currently have our own solution, but would prefer to follow a standard.

andrewkozlik commented 5 years ago

Regarding the wordlist:

The main purpose of the scheme under consideration is to create long-term backups of the master secret, which will usually be needed only if the device holding the master secret is destroyed or the user wishes to migrate to another device. Memorability of the mnemonic is therefore not our main objective, as there are probably few people who would be able to memorize such a mnemonic for a period of, say, several years if they are not required to recall it on a regular basis. Nevertheless, we will look at your wordlists and see what we can do in this regard. Our main objective is to make the words in the wordlist sufficiently different from each other in terms of pronunciation and edit distance, including readability issues such as "cl" <-> "d", because the mnemonics are expected to be written down by hand on a piece of paper. Another criterion is easy entry (all words begin with a unique 4-letter prefix). We are also focusing on selecting common English words, but we hadn't considered difficulty of pronunciation for non-native speakers. We will add that to our criteria.
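The two mechanical criteria above (unique 4-letter prefixes and a minimum edit distance) can be checked automatically. The following is only a minimal sketch: the function names, the sample words, and the distance threshold are hypothetical and are not taken from SLIP-0039 or any real wordlist. Plain Levenshtein distance also does not capture visual confusions such as "cl" <-> "d", which would need a custom cost model.

```python
from itertools import combinations


def has_unique_prefixes(words, prefix_len=4):
    """Return True if no two words share the same first `prefix_len` letters."""
    prefixes = [w[:prefix_len] for w in words]
    return len(set(prefixes)) == len(prefixes)


def edit_distance(a, b):
    """Levenshtein distance between two strings (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute (or match)
        prev = curr
    return prev[-1]


def too_similar_pairs(words, min_distance=2):
    """List word pairs whose edit distance is below `min_distance` (hypothetical threshold)."""
    return [(a, b) for a, b in combinations(words, 2)
            if edit_distance(a, b) < min_distance]


# Hypothetical sample words, for illustration only.
sample = ["academic", "acid", "acne", "acquire", "clay", "day"]
print(has_unique_prefixes(sample))
print(too_similar_pairs(sample, min_distance=3))
```

The pairwise check is quadratic in the list size, which is still cheap for a list of about a thousand words.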

I think the iambic pentameter poetry would be a marvelous tool for generating good passphrases, since these are intended to be remembered and not written down.

andrewkozlik commented 5 years ago

Regarding attack vectors:

The assumption is that the mnemonics will be entered directly into a trusted device, which will compute the master secret. If some of the mnemonics are transmitted over an insecure channel, e.g. via a wiretapped telephone to the person entering them into the trusted device, then this alone does not compromise the confidentiality of the master secret. Firstly, at least T shares are required to reconstruct the pre-master secret; any T-1 or fewer shares do not reveal any information about it. Secondly, even if T shares are compromised, there is still the passphrase to protect the master secret.
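The threshold property can be illustrated with a toy Shamir split. This is only a sketch under simplifying assumptions: it works over a prime field chosen for brevity rather than the field used in the SLIP-0039 draft, the names `split_secret` and `recover_secret` are hypothetical, and it omits the mnemonic encoding, checksum, and passphrase steps entirely. It should not be used to protect real secrets.

```python
import secrets

# Assumption for this sketch only: arithmetic modulo the Mersenne prime 2**127 - 1.
PRIME = 2**127 - 1


def split_secret(secret, threshold, num_shares):
    """Return `num_shares` points on a random polynomial of degree threshold - 1."""
    if not 0 <= secret < PRIME:
        raise ValueError("secret must be in [0, PRIME)")
    if not 1 <= threshold <= num_shares:
        raise ValueError("need 1 <= threshold <= num_shares")
    # The constant term is the secret; the other coefficients are uniformly random.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]
    shares = []
    for x in range(1, num_shares + 1):
        y = 0
        for c in reversed(coeffs):  # Horner evaluation of the polynomial at x
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares


def recover_secret(shares):
    """Lagrange-interpolate the polynomial at x = 0 from (x, y) shares."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        # pow(den, -1, PRIME) is the modular inverse (Python 3.8+).
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret


shares = split_secret(123456789, threshold=3, num_shares=5)
print(recover_secret(shares[:3]) == 123456789)
```

With fewer than `threshold` shares, every candidate secret is consistent with some polynomial of the right degree, which is why T-1 shares reveal nothing.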

A DoS attack is always possible. If an attacker gives the user a fake share to prevent them from reconstructing their master secret, then the user will be able to detect that something went wrong, because the reconstructed HD wallets will not contain any funds. Adding another checksum would make this detection automated, but it would increase the length of the mnemonics, which doesn't seem worth it.

There is one dangerous attack that we are explicitly mitigating; it is explained in the Design Rationale section https://github.com/satoshilabs/slips/blob/master/slip-0039.md#IndexEncoding.

keepkeyjon commented 5 years ago

What other teams or communities besides Trezor are committed to standardizing a Shamir Secret Sharing Scheme?

I'd love to see one standardized and widely implemented across various vendors. It's difficult to have to tell people that their backup plans to split up BIP39 wordlists are not secure, but also to not have a well supported alternative.

howech commented 5 years ago

I made a similar comment in a closed issue somewhere, but this open discussion might be a better forum.

I was not aware of SLIP-0039 until this morning, but over the weekend I implemented something similar in a trezor-T emulator (see the 'ssss' branch of https://github.com/howech/trezor-core.git and the 'allow_15_21_mnemonic_length' branch of https://github.com/howech/trezor-crypto.git.)

My implementation differs from the proposal in a couple of ways, but agrees in some other ways. Agreements: family info, threshold size and shamir index included in preamble metadata. Disagreements: my implementation does not include any PRP to protect against partial share attacks, my implementation does not implement any extra error checking, and my implementation just uses BIP39 keywords, encoding and seed reconstruction.

The main thing that my implementation lacks, and that I would hope a real implementation would include, is the ability to reconstruct the secret over multiple sessions - by bringing the device to the shares rather than by collecting the shares together physically at the hardware in a single session.

hatgit commented 5 years ago

I think this proposal is very interesting and also complex (I just came across it yesterday). I was comparing it to Ian Coleman's version and trying to reconcile the differences. Question: If a share is 1024 bits (based on some secret 64hex string acting as the pre-image), how many words would that translate to after applying this standard? Wouldn't it be better to use the existing 2048-word wordlist to encode the bits versus using yet another wordlist? Lastly, are there any other proposals that could further condense the final length of the mnemonic into fewer words without reducing security or losing information?

prusnak commented 5 years ago

@hatgit 1024 bits would translate into 103 words when you use a wordlist of 1024 words = 10 bits per word. 1024 bits would translate into 94 words when you use a wordlist of 2048 words = 11 bits per word (current bip39). As you can see from the values, you don't want to encode this much entropy into words no matter what wordlist you use.
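The word counts above follow from dividing the payload size by the bits carried per word and rounding up. A minimal sketch, assuming the wordlist size is a power of two and counting only the raw payload (no checksum or share metadata); `words_needed` is a hypothetical helper name:

```python
def words_needed(payload_bits, wordlist_size):
    """Minimum number of words needed to encode `payload_bits` bits."""
    if wordlist_size < 2 or wordlist_size & (wordlist_size - 1):
        raise ValueError("this sketch assumes a power-of-two wordlist size")
    bits_per_word = wordlist_size.bit_length() - 1  # log2 of the list size
    return -(-payload_bits // bits_per_word)        # ceiling division


print(words_needed(1024, 1024))  # 103 (10 bits per word)
print(words_needed(1024, 2048))  # 94  (11 bits per word)
print(words_needed(1024, 4096))  # 86  (12 bits per word)
```

Doubling the wordlist adds only one bit per word, so the saving from a larger list is small.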

hatgit commented 5 years ago

@prusnak thanks for the feedback! I was thinking that too, and how even going to 4096 words does not offer much gain. So in this proposal, how many words are being proposed for the equivalent of, say, a 12-word mnemonic that uses 128 bits and the 4-bit deterministic checksum? Would it look like this prototype that Ian Coleman is hosting:

[Image: screen shot 2018-11-30 at 8 49 11 pm]

Or would it be 20 words (was comparing to this excerpt from SLIP-0039):

[Image: screen shot 2018-11-30 at 8 50 40 pm]

prusnak commented 5 years ago

@hatgit The latter.

ChristopherA commented 5 years ago

If there are going to be changes in word lists, I highly recommend a new set of words. For ease of memorization, at minimum the words should have high concreteness (i.e. not abstract) and valence (emotional connection). For non-native English speakers, there are lists of words to avoid because they are often difficult to pronounce. You want to eliminate words that are homonyms. You also want some criteria for choosing words with greater Hamming distance in the first 3 or 4 letters. Finally, you may want to do not only some error detection, but also some limited error correction, like Bech32 does. I have a repo with a folder for some of these kinds of word lists: https://github.com/ChristopherA/iambic-mnemonic/tree/master/word-lists

prusnak commented 5 years ago

@ChristopherA yes, most of your suggestions were applied while creating the new word list

FelixWeis commented 5 years ago

Have you tried getting to a wordlist with unique 3-letter prefixes? Maybe check Google 1-grams: http://storage.googleapis.com/books/ngrams/books/datasetsv2.html

if you can just type 3 letters to identify a word it makes recovery a lot faster.

prusnak commented 5 years ago

@FelixWeis 3 letters are not enough. We are working on easy recovery via a T9-like keyboard (i.e. a keyboard with 9 buttons only), and 3 letters provide only 1000 options (we need 1024 words).
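The counting argument can be sketched as follows: on a keypad with k keys, n key presses can distinguish at most k^n words. The figure of 1000 corresponds to k = 10; with exactly 9 keys the bound is 729. Either way, three presses fall short of 1024 and four are enough. The helper name `presses_needed` is hypothetical.

```python
def presses_needed(num_keys, wordlist_size):
    """Smallest n such that num_keys ** n >= wordlist_size."""
    if num_keys < 2:
        raise ValueError("need at least 2 keys")
    presses = 0
    while num_keys ** presses < wordlist_size:
        presses += 1
    return presses


for keys in (9, 10):
    # Upper bound on words distinguishable with 3 presses, then presses for 1024 words.
    print(keys, keys ** 3, presses_needed(keys, 1024))
```

This is only an upper bound: a real wordlist must also be chosen so that no two words map to the same key sequence.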

prodnet commented 5 years ago

I would like to know whether Ian Coleman's library is secure and whether it can be used in real situations: https://iancoleman.io/shamir39/ (side channels, tamper resistance, etc.)

andrewkozlik commented 5 years ago

I would like to know whether Ian Coleman's library is secure and whether it can be used in real situations: https://iancoleman.io/shamir39/ (side channels, tamper resistance, etc.)

Ian Coleman's shamir39 library is not compatible with SLIP-0039.

prusnak commented 5 years ago

Closing. AFAIK every suggestion was discussed at the #RebootingWebOfTrust event and the good ones were implemented in the standard.