iancoleman / bip39

A web tool for converting BIP39 mnemonic codes
https://iancoleman.io/bip39/
MIT License
3.42k stars, 1.42k forks

Could another BIP39 mnemonic serve as a source of entropy? #183

Open leafcutterant opened 6 years ago

leafcutterant commented 6 years ago

I'm not certain what the technical implications would be, but I think it would be useful to be able to enter proper BIP39 mnemonics as sources of entropy.

Thanks for your great work!

iancoleman commented 6 years ago

A bip39 mnemonic is an encoding of entropy, and using it as entropy itself is muddying the waters.

The chain of encodings is

entropy > mnemonic > seed > extended key > ecdsa key > address

It's best not to mix them up with each other. Just use 'proper' entropy as described in the 'read more' section about entropy.
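The first links of that chain can be sketched in a few lines of Python. This is a minimal illustration based on the BIP39 specification, not code taken from this tool; the function names are made up for the example, and the word list lookup is left out, so the first function returns word indices rather than words.

```python
import hashlib
import unicodedata


def entropy_to_word_indices(entropy: bytes) -> list:
    """entropy > mnemonic: append the checksum, then cut into 11-bit word indices."""
    if len(entropy) not in (16, 20, 24, 28, 32):
        raise ValueError("BIP39 entropy must be 128 to 256 bits, in 32-bit steps")
    ent_bits = len(entropy) * 8
    cs_bits = ent_bits // 32  # one checksum bit per 32 bits of entropy
    digest = hashlib.sha256(entropy).digest()
    bits = bin(int.from_bytes(entropy, "big"))[2:].zfill(ent_bits)
    bits += bin(int.from_bytes(digest, "big"))[2:].zfill(256)[:cs_bits]
    return [int(bits[i:i + 11], 2) for i in range(0, len(bits), 11)]


def mnemonic_to_seed(mnemonic: str, passphrase: str = "") -> bytes:
    """mnemonic > seed: PBKDF2-HMAC-SHA512, 2048 rounds, salt "mnemonic" + passphrase."""
    password = unicodedata.normalize("NFKD", mnemonic).encode("utf-8")
    salt = unicodedata.normalize("NFKD", "mnemonic" + passphrase).encode("utf-8")
    return hashlib.pbkdf2_hmac("sha512", password, salt, 2048)


# 128 bits of zero entropy map to eleven times index 0 plus one checksum-dependent index.
indices = entropy_to_word_indices(bytes(16))
seed = mnemonic_to_seed(" ".join(["abandon"] * 11 + ["about"]))
```

Each step is one-way in practice from the seed onward, which is why feeding a later stage back in as an earlier one (a mnemonic as entropy) mixes up the encodings.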

hatgit commented 5 years ago

I had thought of this too before reading this post, and although the issue was already closed, I think it's worth pointing out that pasting a mnemonic as entropy produces a much weaker mnemonic, not a stronger one. I believe the reason is that the bits from the words are not being properly decoded: the filtered entropy string is much shorter than the input (the string appears to be treated as hex, so the non-hex characters are discarded?). Perhaps it would be better to convert the mnemonic to ASCII, convert that to binary or hex, and then use the binary or hex result as entropy, instead of pasting the 12 words directly. I think this was also mentioned here: https://github.com/iancoleman/bip39/issues/79

Ian, perhaps it's worth looking at this, because if the mnemonic were being properly decoded, the filtered entropy string should be much longer than the 88 bits shown below:

[Screenshot: 2018-08-27 at 4:14:31 pm]

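The effect described above can be reproduced with a small sketch. It assumes, as the comment guesses, that the input is treated as hex and every non-hex character is dropped; `filter_as_hex` is a hypothetical helper for illustration, not the tool's actual code, and the mnemonic is the well-known all-zero test phrase rather than the one in the screenshot.

```python
HEX_CHARS = set("0123456789abcdefABCDEF")


def filter_as_hex(text: str) -> str:
    """Keep only characters that are valid hex digits (assumed filter behaviour)."""
    return "".join(ch for ch in text if ch in HEX_CHARS)


mnemonic = " ".join(["abandon"] * 11 + ["about"])

# Pasting the words directly: only a, b, d survive from each word.
filtered = filter_as_hex(mnemonic)
bits_if_hex = len(filtered) * 4  # each surviving character counts as 4 bits

# The alternative suggested above: encode the whole phrase as ASCII, then as hex.
ascii_hex = mnemonic.encode("ascii").hex()
```

Note that the ASCII route yields a much longer hex string, but a longer string is not more entropy: a 12-word mnemonic still encodes only 128 bits, however it is re-encoded.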
iancoleman commented 5 years ago

Yes, I can certainly understand the sentiment here, but I'm very wary of introducing features that instil a false sense of security. Brainwallets are the classic example; including the full ASCII charset for entropy is definitely a step toward similar temptations.

Perhaps a good way forward is to make a 'smart' guess about the entropy type (as happens now), but also give an option to force the character set to a different user preference. With character-set inference, a single 'incorrect' character is enough for the whole input to be interpreted as a different character set, e.g. 000111101011101e would be inferred as hex, with every 0 and 1 treated as a 4-bit hex character, which seems dangerous to me.
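A rough sketch of that hazard, and of a manual override, is below. The charset table, `infer_charset` and `entropy_bits` are hypothetical names invented for this example; they are an assumption about how such inference could work, not the tool's real getBase() logic or its full list of supported bases.

```python
import math

# Hypothetical charsets, ordered from smallest to largest alphabet.
CHARSETS = [
    ("binary", "01"),
    ("decimal", "0123456789"),
    ("hex", "0123456789abcdef"),
]


def infer_charset(text: str) -> str:
    """Pick the smallest charset that contains every character of the input."""
    chars = set(text.lower())
    for name, alphabet in CHARSETS:
        if chars <= set(alphabet):
            return name
    return "hex"  # assumed fallback: treat as hex and filter the rest


def entropy_bits(text: str, forced: str = None):
    """Return (charset used, bits counted) after dropping invalid characters."""
    name = forced or infer_charset(text)
    alphabet = dict(CHARSETS)[name]
    kept = [ch for ch in text.lower() if ch in alphabet]
    return name, math.floor(len(kept) * math.log2(len(alphabet)))


clean = entropy_bits("000111101011101")                      # 15 binary digits
typo = entropy_bits("000111101011101e")                      # one stray 'e'
forced = entropy_bits("000111101011101e", forced="binary")   # user override
```

With inference alone, the stray `e` turns 15 bits of input into an apparent 64 bits; forcing the charset to binary drops the `e` and counts 15 bits again.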

So in summary the proposed changes are:

What are your thoughts on allowing UTF-8 Unicode for entropy?

hatgit commented 5 years ago

Thanks @iancoleman for re-opening this and for the detailed feedback. I agree, and believe that anything that increases entropy would be good, but I acknowledge the potential for overlap and conflicts in having yet another "function getBase()" check trying to guess the input type.

[Screenshot: 2018-08-27 at 10:26:14 pm]

I think it's a great idea to give users the choice to force an override of the character set with some checkbox option (as in the example in this thread, where UTF-8 letters got treated as hex and the non-hex characters were discarded, resulting in weaker entropy). If adding a separate checkbox for Unicode would be the cleanest/safest way, perhaps that is indeed the best solution.

P.S. Perhaps also tweak what happens when hex is detected, so that only the leading 0x prefix is discarded rather than every non-hex character (with the other characters being converted instead; not sure)?

hatgit commented 5 years ago

Just wanted to add that I am not sure what the result of allowing UTF-8 (which is backward compatible with ASCII) would be in situations where a user enters non-printable control characters (for example when pasting a string that contains them), unless perhaps those control characters were decoded into their hex index value in the ASCII range. For example, in Python3, using the binary number for 28, chr(0b0011100) = '\x1c', which is hex 1C or decimal 28, the "File Separator" control character. It could technically be pasted into the tool by a user and might otherwise trigger an error if not decoded properly.
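To make the control-character case concrete, here is a small Python sketch. `describe_chars` is a hypothetical helper written for this example; it only shows how each pasted character could be mapped to its hex code point and flagged as a control character, and says nothing about how the tool itself would handle such input.

```python
import unicodedata


def describe_chars(text: str) -> list:
    """Return (hex code point, is_control) for every character in the input."""
    return [
        (f"{ord(ch):02X}", unicodedata.category(ch) == "Cc")
        for ch in text
    ]


# "ab" followed by the File Separator control character (decimal 28, hex 1C).
sample = "ab" + chr(0b0011100)

described = describe_chars(sample)
as_hex = sample.encode("utf-8").hex()  # control characters encode like any other byte
```

Encoding to UTF-8 bytes never fails on a control character, so the open question is a policy one: whether such characters should be counted, rejected, or filtered out.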

skironDotNet commented 5 years ago

Take a look at my version. I will create a new one that is easy to merge with iancoleman's. Mine was based on coinomi's, but it's outdated. Anyhow, you are trying to create a "brain wallet" in a way. Entropy, in my understanding, is a binary source for further derivation, so if you want a "brain wallet" you need to turn your phrase into entropy. I don't remember exactly how I did mine, but it's based on the current approach of SHA256-hashing the text, so your entropy is 256 bits strong. The problem is that an attacker knows the algorithm behind it, so if you use simple text he could brute-force your simple text to get the complex entropy and generate the rest, so use with caution. DON'T use my solution, as I may change the implementation in the next version, but play with it: https://github.com/skironDotNet/bip39-passphrase hosted here: https://greenhex.net/bip39-personal-standalone.html
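The hashing approach mentioned above can be sketched as follows. This is only an illustration of "SHA256 the text to get 256 bits", with a hypothetical function name; it is not taken from either repository, and the weakness described in the comment applies to it in full.

```python
import hashlib


def text_to_entropy(text: str) -> bytes:
    """Hash arbitrary text into 32 bytes (256 bits) usable as BIP39 entropy."""
    return hashlib.sha256(text.encode("utf-8")).digest()


entropy = text_to_entropy("correct horse battery staple")
entropy_hex = entropy.hex()  # 64 hex characters, i.e. 256 bits of output
```

The output always looks like 256 random bits, but it is only as hard to guess as the input text: anyone who knows the scheme can hash candidate phrases and compare.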

iancoleman commented 4 years ago

allow manual override of the character set to use (will still filter out invalid chars)

See https://github.com/iancoleman/bip39/commit/516c16d721db88b4b2c39964e2d5e8f6310c7bff - Allow manual override for entropy type

Still to do: include charsets from #79