Some secrets are corrupted

DoronShapiro commented 8 years ago

To reproduce:

>>> from secretsharing import SecretSharer
>>>
>>> secret = "04bbcb1fbec99d65bf59d85c8cb62ee2db963f0fe106f483d9afa73bd4e39a8a"
>>> shares = SecretSharer.split_secret(secret, 2, 3)
>>> recovered = SecretSharer.recover_secret(shares[0:2])
>>> recovered == secret
False

This is due utilitybelt's charset_to_int behavior of dropping leading characters corresponding to a 0 in the charset when it is called on the secret.

If this is not the intended behavior from utilitybelt, a simple fix is to modify charset_to_int and int_to_charset to prepend a 1 to the returned value in order to preserve leading zeroes.

shea256 commented 8 years ago

Hi @DoronShapiro thank you for reporting this and apologies for the delay on getting to this.

In actuality, this was originally intended as a feature and not a bug, although I do concede that the UX of this is not great.

The way this library works is strings need to be encoded into the integer parameters of a curve, then that curve needs to be converted into integer points on the curve. That means that you simply have to accept that leading zeros will be dropped.

One way to get around this, as you mentioned, is to prepend a 1 to the value. This is not good, however, as it means that the points are incorrect encodings of the actual values. This means that another library that saw the shares and wanted to interpret them would have to know to remove the leading 1.

I originally considered that the way to solve this was to have developers use higher-level functions that remember the desired length of the secret and then zero-pad until they get to that length. However, I realized that this is something that the library should handle if it is to have a stellar UX.

Thus, I came up with an alternative solution, which is to count the number of leading zeros and then record that number as a third optional value in the secret encoding.

So a secret without zero-padding will look like this:

'1-3bb14c7d4e31073216b160ad284eceb84420958100a2dafb2c14f2ce9a487dde'

While one with 3 leading zeros will look like this:

1-1dbfe6a0003800fc44c245c05cff1e86cd6a53c3a7e23f06e1c779d24b60aa01-3

Would love your thoughts on this new design.

shea256 commented 8 years ago

The new design can be seen in this branch:

https://github.com/blockstack/secret-sharing/tree/zero-padding-support

DoronShapiro commented 8 years ago

Correct me if I'm mistaken, but it looks like the number of zeros is encoded in the shares rather than the secret. I don't think this fits within the secret sharing threat model - all holders of shares now know something about the secret, namely how many leading zeros it has.

Compatibility can be a concern - a cleaner way to prevent this sort of issue would definitely be a plus! Unfortunately, I don't know that there is a standard implementation here to comply with. Hopefully documenting the extra step would allow other implementations to recover secrets generated with this library.

shea256 commented 8 years ago

Good point. OK how about this - instead of indicating the number of leading zeros, we include the length that the string should be padded to? This wouldn't reveal anything about the secret.

DoronShapiro commented 8 years ago

This sounds better, although if each share contains this information, some thought may have to be put into what happens when shares disagree on the length (e.g. if one share gets corrupted).

shea256 commented 8 years ago

That's fine. The shares have to have a consistent formatting anyway. If one share gets corrupted then the recovery process will fail. Then the library user would have to swap out the invalid share and use one that is valid (or realize how to correct the share).

fionafibration commented 5 years ago

If you want uncorruptible shares, there are variant secret sharing schemes that can protect against malicious modification, or you could alternatively just add something like a Reed-Solomon error correcting code to the shares, to guard against accidental modification, or go completely overkill and use an ECDSA based signature to verify each share individually.

shea256 / secret-sharing

Some secrets are corrupted #16