selfcustody / krux

Open-source signing device firmware for Bitcoin
https://selfcustody.github.io/krux/
Other
175 stars 34 forks source link

Double mnemonic #432

Closed tadeubas closed 1 month ago

tadeubas commented 1 month ago

PR to fulfill this issue: #318

What is the purpose of this pull request?

codecov[bot] commented 1 month ago

Codecov Report

Attention: Patch coverage is 98.09524% with 2 lines in your changes missing coverage. Please review.

Project coverage is 94.20%. Comparing base (af4b621) to head (478730c). Report is 45 commits behind head on develop.

Files Patch % Lines
src/krux/pages/login.py 93.93% 2 Missing :warning:
Additional details and impacted files ```diff @@ Coverage Diff @@ ## develop #432 +/- ## =========================================== + Coverage 94.15% 94.20% +0.04% =========================================== Files 58 59 +1 Lines 7189 7264 +75 =========================================== + Hits 6769 6843 +74 - Misses 420 421 +1 ```

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

odudex commented 1 month ago

Before we proceed with this PR, I would like to see a more detailed context, including a clear definition of the feature and the term 'double mnemonic.' Additionally, the goals and demands being addressed. The docs should also include a description of the functionality, and a tool to generate 'double mnemonics' must be created, as it does not currently exist in Krux.

tadeubas commented 1 month ago

Finished this PR for review!

jdlcdl commented 1 month ago

As of latest optimization, uncontrolled testing (using Amigo, doing double-mnemonic again and again, watching the usb console) is showing that a try takes just less than 6ms.

My testing over many variations of this pr, finding 77 double mnemonics indicates that a double was found roughly once in 246 tries. Is my math correct here on the expected probability being 1/256? In Tadeu's implementation here, a try is done knowing the first and the second valid 12-word mnemonics, therefore putting these together into a 24-word mnemonic will have a valid last word 1 in 256 times just like any other 23 words + another word will have the same probability of being valid? ...and that is very different than an incorrect intuition that 24 scrambled words might have a 1 in 256 chance of being a valid double mnemonic. Rather, 24 scrambled words would have a 1/256 chance of being a valid 24w mnemonic, then the first 12w would have a 1/16 chance of being valid, and the second 12w would also have a 1/16 chance of being valid... for once in 16 16 256 scrambled-24word phrases 1/65536???

Nice job on implementation, documentation, and automated tests!

jdlcdl commented 1 month ago

... for once in 16 16 256 scrambled-24word phrases 1/65536???

Answering my second question above with brute-force -- might add "and ignorance" ;)

import random
from embit import bip39

tries = 0
found = 0

while tries < 2**24:
    tries += 1
    words24 = random.choices(bip39.WORDLIST, k=24)
    if (
        bip39.mnemonic_is_valid(" ".join(words24[:12]))
        and bip39.mnemonic_is_valid(" ".join(words24[12:]))
        and bip39.mnemonic_is_valid(" ".join(words24))
    ):
        print(f"double: \"{' '.join(words24)}\" found on try: {tries}")
        found += 1

if found:
    print(f"found {found} / tried {tries}: 1/{tries/found}")
else:
    print(f"found none, tried {tries}")

Which reported:

...
found 267 / tried 16777216: 1/62836.01498127341

...while I was expecting to find 254; It's "in the ballpark" to me.

tadeubas commented 1 month ago

... for once in 16 16 256 scrambled-24word phrases 1/65536???

Answering my second question above with brute-force -- might add "and ignorance" ;)


import random
from embit import bip39

can't believe you've used embit version to do this brute-force 🤣

odudex commented 1 month ago

Great work! This PR goes beyond the cool feature and brings many other good optimizations. I liked them all! Only minor detail I would change is restore default touch behavior, that is to validate position(ignoring touches with incomplete or invalid metadata) as default, for stability reasons. Other than that I believe we could move on and merge it!

tadeubas commented 1 month ago

Thank you! Regarding the change in the default touch behavior, what are the stability reasons? Every call that was passing the False/True argument is behaving the same way as before, only the default has been changed to cover most cases (hence the default).

jdlcdl commented 1 month ago

Storing notes about my findings of krux.bip39, which replaces embit.bip39.mnemonic_to_bytes() as 1) a complete rewrite using big integers that I'll argue is easier to read/understand, and more importantly 2) which has significant performance enhancements that pay-off when brute-forcing mnemonics.

I had originally considered submitting a pr to embit with these changes (even though I find it arguably scandalous for an anonymous contributor to rewrite such a well reviewed/tested-in-production/stable function from a well respected developer) but I needed testing/measuring of the performance differences on at least a small set of hardware devices.

There are two major differences in the implementation of krux.bip39.mnemonic_to_bytes() which explain the significant speedup.

The rest of the changes in krux.bip39.mnemonic_to_bytes() implementation might gain an extra 2-8% increase in performance, but not more than that on any of the devices I tested.

Because embit is geared towards limited-resource micro-controllers, I will plan only to submit a "safe" pull-request to embit which addresses the first performance boost above via the try/except block. As for the second WORDLIST dict, while it succeeded without memory errors in all of my device tests, I still believe it will be a contentious change for embit.

But, I have good news. Both embit.bip39.mnemonic_is_valid() and embit.bip39.mnemonic_to_bytes() accept an optional wordlist parameter and 3rd party apps, like krux's double-mnemonic code, can pass in a more efficient structure if they need added performance. If/when embit implements the first speedup, we can ditch krux.bip39 and use this trick.

# this dict is more efficient than calling list.index(word)
class WordIndex(dict):
    def index(self, word):
        return self[word]

# if device has enough memory and is brute-forcing, then use a WordIndex object
wordindex = WordIndex({word: i for i, word in enumerate(bip39.WORDLIST)})
bip39.mnemonic_is_valid(mnemonic, wordlist=wordindex)
bip39.mnemonic_to_bytes(mnemonic, wordlist=wordindex)

Notes related to my testing included below:

krux.bip39 using WORDINDEX returns in X% of the time and is Y times faster than embit currently:
* ubuntu amd64-Xeon:     ~2%, ~50x faster
* ubuntu arm64-pi4:      ~3%, ~30x faster
* raspios arm-pi0W:      ~2%, ~57x faster
* MaixPy k210-Amigo:    ~18%,  ~6x faster
* upython esp32s3wroom:  ~5%, ~22x faster
* upython rp2040-picoW: ~16%,  ~6x faster

krux.bip39 using WORDINDEX returns in X% of the time and is Y times faster than embit using `try/except` speedup:
* ubuntu amd64-Xeon:     ~4%, ~26x faster
* ubuntu arm64-pi4:      ~6%, ~16x faster
* raspios arm-pi0W:      ~3%, ~30x faster
* MaixPy k210-Amigo:    ~56%,  ~2x faster
* upython esp32s3wroom: ~20%,  ~5x faster
* upython rp2040-picoW: ~45%,  ~2x faster

krux.bip39 w/o WORDINDEX returns in X% of the time and is Y times faster than embit currently:
* ubuntu amd64-Xeon:    ~50%,  ~2x faster
* ubuntu arm64-pi4:     ~55%,  ~2x faster
* raspios arm-pi0W:     ~50%,  ~2x faster
* MaixPy k210-Amigo:    ~30%,  ~3x faster
* upython esp32s3wroom: ~21%,  ~5x faster
* upython rp2040-picoW: ~34%,  ~3x faster

krux.bip39 w/o WORDINDEX returns in X% of the time and is Y times faster than embit using `try/except` speedup:
* ubuntu amd64-Xeon:    ~95%,  insignificant
* ubuntu arm64-pi4:     ~93%,  insignificant
* raspios arm-pi0W:     ~94%,  insignificant
* MaixPy k210-Amigo:    ~96%,  insignificant
* upython esp32s3wroom: ~97%,  insignificant
* upython rp2040-picoW: ~98%,  insignificant

Since I still believe the rewrite is scandalous and heresy, I'll plan instead to pr the "safe" try/except speedup to embit.

odudex commented 1 month ago

Thank you! Regarding the change in the default touch behavior, what are the stability reasons? Every call that was passing the False/True argument is behaving the same way as before, only the default has been changed to cover most cases (hence the default).

The majority of touch handling code necessitates a valid position. Failing to provide one can result in unpredictable behavior, which is why I consider position validation to be default behavior.

tadeubas commented 1 month ago

Returned the previous behavior to input validate_position

odudex commented 1 month ago

Thank you!