Closed ChristopherA closed 5 years ago
What other teams or communities besides Trezor are committed to standardizing a Shamir Secret Sharing Scheme?
So far none, but I would expect all hardware wallets to follow suit, as they did in the past with BIP39.
Where do you want to hold discussions on this?
Both mailing list and github issues are fine.
to have a birthday in the seed
They might make sense for Lightning and/or SPV nodes, but not for Trezor. We anticipate this standard being used 10-20+ years into the future, and over a timespan like this, storing a birthday is a very small optimization. It also leaks information that the user might want to keep private.
The wordlist is yet to be defined by @onvej-sl and @andrewkozlik, so I'll let them have a look at your wordlist efforts.
Base Zero is very interested in this, but for key backup purposes only. We don't see SSS as suitable for storing value without a multisig layer, because there is a single point of security failure at the point of reassembly. We currently have our own solution, but would prefer to follow a standard.
Regarding the wordlist:
The main purpose of the scheme under consideration is to create long-term backups of the master secret, which will usually be needed only if the device holding the master secret is destroyed or the user wishes to migrate to another device. Memorability of the mnemonic is therefore not our main objective, as there are probably few people who would be able to memorize such a mnemonic for a period of, say, several years if they are not required to recall it on a regular basis. Nevertheless, we will look at your wordlists and see what we can do in this regard. Our main objective is to make the words in the wordlist sufficiently different from each other in terms of pronunciation and edit distance, including readability issues such as "cl" <-> "d", because the mnemonics are expected to be written down by hand on a piece of paper. Another criterion is easy entry (all words begin with a unique 4-letter prefix). We are also focusing on selecting common English words, but we hadn't considered difficulty of pronunciation for non-native speakers. We will add that to our criteria.
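The two mechanical criteria above (a unique 4-letter prefix and a minimum edit distance between words) can be checked with a short script. This is only a sketch: the function names, the `min_distance` threshold, and the sample words are assumptions made for illustration, not part of SLIP-0039 or its wordlist tooling.

```python
from collections import defaultdict


def prefix_collisions(words, prefix_len=4):
    """Return groups of words that share the same first `prefix_len` letters."""
    groups = defaultdict(list)
    for word in words:
        groups[word[:prefix_len]].append(word)
    return {prefix: ws for prefix, ws in groups.items() if len(ws) > 1}


def edit_distance(a, b):
    """Levenshtein distance using the standard dynamic-programming recurrence."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                # delete ca
                            curr[j - 1] + 1,            # insert cb
                            prev[j - 1] + (ca != cb)))  # substitute or match
        prev = curr
    return prev[-1]


def close_pairs(words, min_distance):
    """Return word pairs whose edit distance is below `min_distance`."""
    return [(a, b)
            for i, a in enumerate(words)
            for b in words[i + 1:]
            if edit_distance(a, b) < min_distance]


# Hypothetical sample, chosen to trigger both checks.
sample = ["academic", "acid", "acne", "acrobat", "acrobatic", "clock", "dock"]
print(prefix_collisions(sample))            # words sharing a 4-letter prefix
print(close_pairs(sample, min_distance=3))  # pairs that are easy to confuse
```

Plain edit distance treats the handwriting confusion "cl" <-> "d" as two edits, so a real selection process would probably need a weighted distance or an explicit list of confusable letter groups.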
I think the iambic pentameter poetry would be a marvelous tool for generating good passphrases, since these are intended to be remembered and not written down.
Regarding attack vectors:
The assumption is that the mnemonics will be entered directly into a trusted device, which will compute the master secret. If some of the mnemonics are transmitted over an insecure channel, e.g. via a wiretapped telephone to the person entering them into the trusted device, then this alone does not compromise the confidentiality of the master secret. Firstly, at least T shares are required to reconstruct the pre-master secret; any T-1 or fewer shares do not reveal any information about it. Secondly, even if T shares are compromised, there is still the passphrase to protect the master secret.
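The threshold property can be illustrated with a minimal Shamir sketch. Note the assumptions: this version works over a prime field with a modulus picked for convenience, and the names `PRIME`, `split_secret` and `recover_secret` are made up for the example. It is not the SLIP-0039 construction and it omits the passphrase step and the mnemonic encoding entirely.

```python
import secrets

# Illustrative field modulus (a Mersenne prime); an assumption for this sketch.
PRIME = 2**127 - 1


def split_secret(secret, threshold, share_count):
    """Split `secret` (an int below PRIME) into `share_count` points on a
    random polynomial of degree threshold - 1 whose constant term is the secret."""
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]
    shares = []
    for x in range(1, share_count + 1):
        y = 0
        for c in reversed(coeffs):  # Horner evaluation of the polynomial at x
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares


def recover_secret(shares):
    """Lagrange-interpolate the polynomial at x = 0 from the given shares."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        # pow(den, -1, PRIME) is the modular inverse (Python 3.8+).
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret


shares = split_secret(123456789, threshold=3, share_count=5)
print(recover_secret(shares[:3]))   # any 3 shares give back 123456789
print(recover_secret(shares[:2]))   # 2 shares interpolate to an unrelated value
```

With only T-1 shares, every candidate secret is consistent with some polynomial through those points, which is why fewer than T shares reveal nothing.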
A DOS attack is always possible. If an attacker gives the user a fake share to prevent them from reconstructing their master secret, then the user will be able to detect that something went wrong, because the reconstructed HD wallets will not contain any funds. Adding another checksum would make this detection automated, but it would increase the length of the mnemonics, which doesn't seem worth it.
There is one dangerous attack that we are explicitly mitigating, which is explained in the Design Rationale section: https://github.com/satoshilabs/slips/blob/master/slip-0039.md#IndexEncoding.
What other teams or communities besides Trezor are committed to standardizing a Shamir Secret Sharing Scheme?
I'd love to see one standardized and widely implemented across various vendors. It's difficult to have to tell people that their backup plans to split up BIP39 wordlists are not secure, but also to not have a well supported alternative.
I made a similar comment in a closed issue somewhere, but this open discussion might be a better forum.
I was not aware of SLIP-0039 until this morning, but over the weekend I implemented something similar in a trezor-T emulator (see the 'ssss' branch of https://github.com/howech/trezor-core.git and the 'allow_15_21_mnemonic_length' branch of https://github.com/howech/trezor-crypto.git.)
My implementation differs from the proposal in a couple of ways, but agrees in some other ways. Agreements: family info, threshold size and shamir index included in preamble metadata. Disagreements: my implementation does not include any PRP to protect against partial share attacks, my implementation does not implement any extra error checking, and my implementation just uses BIP39 keywords, encoding and seed reconstruction.
The main thing that my implementation lacks, and that I would hope a real implementation would include, is the ability to reconstruct the secret over multiple sessions - by bringing the device to the shares rather than by collecting the shares together physically at the hardware in a single session.
I think this proposal is very interesting and also complex (just came across it yesterday). I was comparing it to Ian Coleman's version and trying to reconcile the differences. Question: If a share is 1024 bits (based on some secret 64hex string acting as the pre-image), how many words would that translate to after applying this standard? Wouldn't it be better to use the existing 2048-word wordlist to encode the bits versus using yet another wordlist? Lastly, are there any other proposals that could further condense the final length of the mnemonic into fewer words without reducing security or losing information?
@hatgit 1024 bits would translate into 103 words when you use a wordlist of 1024 words = 10 bits per word. 1024 bits would translate into 94 words when you use a wordlist of 2048 words = 11 bits per word (current bip39). As you can see from the values, you don't want to encode this much entropy into words no matter what wordlist you use.
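The arithmetic behind these figures can be reproduced in a few lines. The helper name `words_needed` is made up for this illustration, and it assumes the wordlist size is a power of two so that each word carries a whole number of bits.

```python
def words_needed(bits, wordlist_size):
    """Number of words needed to encode `bits` bits, assuming the wordlist
    size is a power of two (so each word carries log2(size) bits)."""
    bits_per_word = wordlist_size.bit_length() - 1  # exact log2 for powers of two
    return -(-bits // bits_per_word)                # ceiling division


for size in (1024, 2048, 4096):
    print(size, words_needed(1024, size))
# 1024 103
# 2048 94
# 4096 86
```

Doubling the wordlist adds only one bit per word, so the word count shrinks slowly as the list grows.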
@prusnak thanks for the feedback! I was thinking that too, and how even going to 4096 words does not offer much gain. So in this proposal, how many words are being proposed for, say, a 12-word mnemonic that uses 128 bits and the 4-bit deterministic checksum? Would it look like this prototype that Ian Coleman is hosting:
Or would it be 20 words (was comparing to this excerpt from SLIP-0039):
@hatgit The latter.
If there are going to be changes in word lists, I highly recommend a new set of words. For ease of memory, at minimum the words should have high concreteness (i.e. not abstract) and valence (emotional connection). For non-native English speakers, there are lists of words to avoid because they are often difficult to pronounce. You want to eliminate words that are homonyms. You also have some criteria for choosing words with greater Hamming distance in the first 3 or 4 letters. Finally, you may want to do not only some error detection but also some limited error correction, like Bech32 does. I have a repo with a folder for some of these kinds of word lists: https://github.com/ChristopherA/iambic-mnemonic/tree/master/word-lists
@ChristopherA yes, most of your suggestions were applied while creating the new word list
Have you tried getting to a wordlist with a unique 3-letter prefix? Maybe check Google 1-grams: http://storage.googleapis.com/books/ngrams/books/datasetsv2.html
If you can type just 3 letters to identify a word, it makes recovery a lot faster.
@FelixWeis 3 letters are not enough. We are working on easy recovery via a T9-like keyboard (i.e. a keyboard with 9 buttons only), and 3 letters provide only 1000 options (we need 1024 words).
I would like to know whether Ian Coleman's library is secure and whether it can be used in a real situation: https://iancoleman.io/shamir39/ (side channels, tamper resistance, etc.)
Ian Coleman's shamir39 library is not compatible with SLIP-0039.
Closing. AFAIK every suggestion was discussed at #RebootingWebOfTrust event and good ones were implemented in the standard.
Re: https://github.com/satoshilabs/slips/blob/master/slip-0039.md
(copied from https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2018-September/016418.html)
I and a number of companies & communities I am involved with are very interested in this.
A challenge is that Shamir Secret Sharing has subtleties. To quote Greg Maxwell:
Some questions for you:
What other teams or communities besides Trezor are committed to standardizing a Shamir Secret Sharing Scheme? I can say that the RebootingWebOfTrust community (meeting again for the 7th time next week in Toronto https://rwot7.eventbrite.com) are very interested.
Where do you want to hold discussions on this? Do people object to having this discussion on this mailing list? Or should it be issues in SLIPS repo or on some other mailing list?
Presuming a successful split of secrets, I don’t know all the adversarial problems that are associated with recovery of an SSS. As this would be an interactive event, I presume an attacker can DOS a request to reassemble keys (so maybe some check of the integrity of each share vs. all is required). And of course there are the biggest problems: impersonation of a reassembly request and a MitM of a reassembly request. Are there other attacks? Are you trying to mitigate any of these?
Two comments:
The Lightning Network community has added to their BIP32 mnemonics the ability to have a birthday in the seed, to make it easier to scan the blockchain for keys, as well as a byte with some way to know how to derive key paths for it. I don’t see a BOLT for this (it was mentioned in https://bitcoin.stackexchange.com/questions/74805/what-is-birthday-in-the-context-of-bip39-lightning-seed-generation). I would suggest that you also get some of their latest thoughts and incorporate them.
I worked with Chris Vickery while at Blockstream on various possible ways to improve mnemonic word lists. I’m not suggesting that you necessarily go as far as we did to try to create a mnemonic that is iambic pentameter poetry (inspired by https://www.isi.edu/natural-language/mt/memorize-random-60.pdf); however, we did find sources for words that are concrete (for example, table is more concrete than truth http://crr.ugent.be/papers/Brysbaert_Warriner_Kuperman_BRM_Concreteness_ratings.pdf ) or have strong emotional valence attachment (truth is more emotional than table), both of which can make words more memorable. I also found lists of words that are hard to pronounce unless you are a native English speaker, and eliminated them from my own list.
Among the results of this was a new BIP-39 2048 word compatible word list filtered for memorability (concreteness & emotional valence) and suitability for iambic pentameter, which is located at:
https://github.com/ChristopherA/iambic-mnemonic/blob/master/word-lists/iambic-wordlist.json
…which was created from the repo at
https://github.com/ChristopherA/password_poem
You can find a number of other word lists that I’ve collected here: https://github.com/ChristopherA/iambic-mnemonic/blob/master/word-lists/
If you want to replicate what we did with your own criteria, you may want to incorporate information from the CMU dictionary http://www.speech.cs.cmu.edu/cgi-bin/cmudict, the top 5000 words https://github.com/ChristopherA/password_poem/blob/master/top5000.json, concrete word lists http://crr.ugent.be/papers/Concreteness_ratings_Brysbaert_et_al_BRM.txt and emotional words (valence) http://crr.ugent.be/archives/1003