bitaps-com / pybtc

Python bitcoin library
GNU General Public License v3.0
86 stars 37 forks source link

Vulnerability in the Shamir Secret Sharing Scheme Implementation #37

Open emmanueloshin opened 1 year ago

emmanueloshin commented 1 year ago

I found vulnerability in the implementation of the scheme that allows an attacker to directly recover some content of share C with only shares A and B in a 3 of 5 threshold scheme. It works in most cases but not all depending on the entropy. If a mnemonic phrase 'S' is split into 3 of 5 shares where shares 'A', 'B' and 'C' are sufficient to restore the secret 'S', if A and B only is used to restore a mnemonic phrase using the mnemonic tool and an output 'C' is generated, then the output 'C' is then put again to replace 'B' and to generate output 'D' and the process is repeated on and on, the outputs C, D,........... formulates the words that consists of C.

For example, S = stuff execute bounce auto brisk orbit creek ticket miracle bleak desk audit This passphrase was spit using the mnemonic tool and using a 3 of 5 threshold scheme in which A, B and C are sufficient to reproduce the secret 'S' where

A = lend green anchor album custom grape repeat easily inflict ring million plate

B = fuel shove embrace track photo truly cart supply action old fancy rent

C = buffalo eagle copy main orient toe brown clump draft negative split ride

When A and B are used to restore memonic we get A + B => D: D = giggle abuse marine emotion stereo onion demand soft found foam wild dust D doesn't have any words that C has but continuing the process

A + D => E

E = invest duty remove science angry crouch bitter target palm buffalo bulk twelve WE HAVE 1 PHRASE 'BUFFALO'

A + E => F

F = quit fun robust viable toe fold city tragic view ladder powder meat WE HAVE ANOTHER PHRASE 'TOE'

A + F => Non unique or invalid shares

D + B => Non unique or invalid shares

E + B => G

G = fiber later dynamic ride below stadium magnet alien lab high bachelor favorite WE HAVE ANOTHER PHRASE 'RIDE'

F+B => Non unique or invalid shares

G+B => Non unique or invalid shares

D+ E => H

H = grab copy pulp large stomach donate gap canoe gloom chase often confirm WE HAVE ANOTHER PHRASE 'COPY'

D + F => Non unique or invalid shares

D + G => Non unique or invalid shares

E + F => Non unique or invalid shares

E + G  => Non unique or invalid shares

E + H => I

I = auto flavor eagle alley horror culture capital nose ranch beauty sure notice WE HAVE ANOTHER PHRASE 'EAGLE'

E + I => J

J = vanish enlist paper junk off grunt typical october abuse jump absent cart NOTHING HERE

J + I => K

K = breeze brown jump modify radio opinion auction magic indicate favorite disease define WE HAVE ANOTHER PHRASE 'BROWN'

Each of the phrases below has at least one word that make up Share 'C' every other step gave the "Non unique or invalid shares" error. Meaning that the attacker has only very few words to play with and can get the mnemonic in no time.

Following this process the attacker was able to retrieve Six words including the check sum word. BUFFALO, EAGLE, COPY, TOE, BREEZE and the check sum word RIDE.

This vulnerability directly exposes the content of Share 'C' with only two known shares in a 3 of 5 scheme implementation.

My Email: oshinemmanuel27@gmail.com BTC Address: bc1qep7ln5dn6wkmefkw2vmy2r68sw49sde78fvu5x

SmartArt09 commented 20 hours ago

I tried this method, but you also have to remember, that each time the program is run, it generates a different set of mnemonics for the same indices, and the words obtained may be present in any of the 3rd, 4th or 5th mnemonic shares (which happened in my test case).

The maximum number of mnemonic shares that can be generated in each run is 255, and each run yields different shares each time, so combining the threshold number of shares from multiple runs won't yield the same original mnemonic. For example:

The original mnemonic for testing; will be split into 3 of 5 thresholds

m = 'goose garden nuclear hire minute flavor just six reject broken ability damp'

shares1 = split_mnemonic(mnemonic=m, threshold=3, total=5) print(shares1)

Output:

182: 'dinner leaf casual toward relief exact turn arrive dutch detect stem ranch', 206: 'life engine coin ketchup exchange rate swing glide lesson control refuse boil', 15: 'plate bronze churn vast hand cruel story west chapter coin change easily', 54: 'random frown clump upset surface reject forum gap egg drastic promote few', 58: 'all pig elder twin meat reveal giant arena across hungry fire express'}

shares2 = split_mnemonic(mnemonic=m, threshold=3, total=5) print(shares2)

Output:

218: 'salute agree venue purity weird rate match swift side fault dwarf cable', 174: 'drastic tail drop season urban reopen loud buzz cook gallery apart merge', 63: 'stool dutch dilemma final predict tissue option embrace acquire boss dust mixture', 51: 'income sibling problem suspect vital trash govern apology flip liquid miracle finger', 252: 'laptop modify orange badge solid wheel hero foster excuse present lumber ripple'}

Only 3 of shares1 or 3 of shares2 will yield the original mnemonic 'm' when combined even though both of the batches may contain common words (like 'dutch' in 182 index in first run and index 63 in second run). They can't be mixed, as on every function call for 'split_mnemonic', a new list of 255 different mnemonics (max. limit) can be generated.

Do let me know if you find it that way, though.