Question about int array to byte array conversion and hash creation when processing share

Sharpiro commented 6 years ago

Hi, thanks for creating this implementation of SLIP-0039. I'm trying to use it to better understand secret sharing technology and improve my implementation

https://github.com/jakubtrnka/ShamirsSecretSharingScheme/blob/ed676df207c3241a66f0ec610cd8e0bc97b75b39/main.cpp#L138-L145

What is the purpose of converting the int vector num_share to an intermediary byte vector hex_share, before you arrive at the final byte vector data preview.data? I do not understand what the purpose of the hex_share format created by Shamir::power2ToHex, nor what kind of data format that function returns.

5-bit index	5-bit M threshold	variable-bit SSS part	16-bit checksum

Also it seems you are using this intermediary byte vector hex_share to perform checksum validation. My understanding was that in order to create a 16 bit checksum, you would take the 2 bytes representing 5-bit-index and 5-bit-m-threshold, concat that with the array of bytes representing the sss-part, and run those concatenated bytes through the SHA256 function, then take the first 2 bytes (leftmost 16 bits) as the checksum.

Thanks.

jakubtrnka commented 6 years ago

Shamir::slip39ToNum transforms string vector (i. e. words) to a vector of numbers - base1024 representation of data. Shamir::power2ToHex transforms base1024 representation of data to base16 representation, i. e. byte vector - the actual raw data, or more general: any base2^n into base16. I'm not sure if I understood how you'd compute the checksum. I just took whole share as it its, zero-padded the last byte and put the 16bits at the end of share (where the data ends - not where the vector ends). I didn't put index + threshold in 2 separate bytes. That's why I created the graphics to make it absolutely clear how this was done/meant. Of course, I could've made mistake and my program might produce wrong outputs. Implementing this bit-wise concatenating, parsing, hashing, always being aware of endianity was a nightmare.

Sharpiro commented 6 years ago

Sorry for the delay, got busy was some other things.

So it looks like part of the difference I'm seeing was how I was just seperating the first 2 bytes from the rest of the buffer, simple enough.

However, I noticed that you found a way to not require the user to input the secret size upon secret recovery by using the padding you mentioned. It looks like the adding and removing of this padding is quite involved, are you going to try to get this added to the SLIP? @onvej-sl mentioned that he expected the user would provide the secret size at recover time.

jakubtrnka commented 6 years ago

I don't use padding in this implementation. I just assume/require the secret to be multiple of 32 bits and any data is correspondingly truncated or zero-padded when moved from/into data structures. Although this is not directly stated in SLIP39, that's what currently most wallets do (many wallets produce just 128 bits as the only option). The problem w/ padding in SLIP39 is that current design for the absolute majority of usecases (128 bits) fits exactly into 17 encoding words, so any padding would require an additional (dummy) word as an indicator that padding was used, which would look kinda stupid. I also found another approach to Shamir's sharing. Ian's method looks more general - allows for arbitrarily many shares and reconstruction threshold, contains version nr, uses some kind of padding, but does that w/ close relation to BIP39, thus implicitly assuming the secret size also to be multiple of 32 bits (actually 33bits, because of the BIP39 checksum). At this point I don't have any strong opinion which one is better - both approaches have different pros and cons. Authors of SLIP39 said the intent is to compete with BIP39, not to build on top of it. From this perspective current SLIP39 is good approach IMO. Just assuming everyone will encode 128 or 256 bits (or rarely 160, 192, 224) and will not want to split into more than 32 shares and for that purpose make the simplest lightest implementation w/o any additional flags, dummy blocks, version numbers etc., seems to be ok.

Sharpiro commented 6 years ago

Thanks. I'm looking into an "optional" way to provide secret size in the event the secret size is not one of the common supported sizes as you mentioned.

I did find one oddity with this line: https://github.com/jakubtrnka/ShamirsSecretSharingScheme/blob/e7cec8832ca13cfebd5c97d1294487c748459433/src/share.cpp#L47

I noticed that your push_n function iterates through the entire vector despite it only saying 16 of the bits are significant. I tested it and regardless of whether you pass in the 32 bytes or just the first 2 bytes, you get the same answer. I just thought it was odd since in this case we are only concerned with the first 2 bytes, but all 32 bytes are passed in then all are iterated through.

Like I said, no affect on the outcome, just something I found.

jakubtrnka commented 6 years ago

You're right. It's not necessary to append whole array and then crop. Thanks for reviewing my code ;) I appreciate that.

Sharpiro commented 6 years ago

np. it's been very helpful to me.

Sharpiro commented 6 years ago

A few more feedback items before I close this. I implemented a few changes to get your code working on Windows since that is mainly what I have access to at the moment.

Are you interested in making the code compile on Windows? This gist seems to be popular and allowed me to get your code compiling on windows. Though I had to comment out this line and add this one. Although this only allowed it to build using Visual Studio 2017 and not VS Code, but i'm sure I was just missing something.
Are you interested in being able to create shares on Windows? This simple snippet let me fully run your code after making it compile via the above fix.

I haven't tested that these changes actually work on Linux, I just had these changes done and wanted to gauge your interest. If you're interested, I wouldn't mind putting some PRs together.

jakubtrnka commented 6 years ago

Yes, on windows absolutely!! I experimented with mingw32 and cross compilation, though I wasn't successful, so I gave up (unless I have more time to try play with it more). I'll check your suggestions out, thanks a lot for it.

If this project is supposed to be compilable for windows, it should be the task for CMake to configure the sources so it compiles correctly using appropriate compiler on given platform. For the first shot, hardcoding the changes is just fine.

Another problem with windows is that every executable produced is immediately labeled and destroyed by almost all antiviruses. It is not good practice to make executable binaries and instruct people to disable their PC security in order to run it. Do you have some experiences how to deal w/ that?

Sharpiro commented 6 years ago

I actually don't run any third-party antivirus software so I've never experienced the issue you mentioned.

jakubtrnka / ShamirsSecretSharingScheme

Question about int array to byte array conversion and hash creation when processing share #1