dsprenkels / sss-cli

Command line program for secret-sharing strings
MIT License
68 stars 20 forks source link

Use erasure coding for ciphertext #8

Closed ansemjo closed 6 years ago

ansemjo commented 6 years ago

This could be an enhancement.

As far as I can tell the full ciphertext is simply appended to the key share, i.e. everything from byte 34 onward is identical in every 'shard'.

For testing purposes I manually encrypted a file with a random keyfile, used secret-share-split only on the keyfile and then 'split' the encrypted file with zfec, which uses an erasure code and has similar threshold settings (i.e. 'split into m files' and 'require k to restore'). This resulted in a considerably smaller size compared to directly splitting the file with secret-share-split - especially useful for larger secrets.

I am not sure if this would introduce any new sidechannels but I imagine it would also make recombination more robust, if anything.

dsprenkels commented 6 years ago

Thanks for the issue. While I think that erasure encoding is really powerful I do not really think that it is in scope here. After all, this is a Shamir secret sharing tool.

However, it may be an idea to refer to erasure code based threshold scheme implementations. If you like you can perhaps add a note to the README.md file referring to zfec?

ansemjo commented 6 years ago

Just to be clear: I don't want to use zfec instead of secret-share-split but I would like to use some form of erasure coding to reduce the output size of your Shamir secret sharing implementation. (Wonderful talk at 34C3, by the way!)

Since this is a commandline tool intended to 'split' files (in a very broad sense), I think it would be useful to be able to split larger files without unnecessarily 'wasting' space. It doesn't need to be zfec. That was just the most easily accessible commandline tool for me.

Generally, I do agree with the KISS principle, i.e. do one thing and do it well .. and I don't mean to change the sss library or sss-rs binding.


To demonstrate size savings, I hacked together some bash script (which also converts final output to base64, #7):

$ for input in "echo 'Hello, World.'" "cat shards.sh" "cat img.jpg"; do
  hex=$($input | secret-share-split -n 4 -t 3 | wc -c)
  b64=$($input | secret-share-split -n 4 -t 3 | ./shards.sh 4 3 | wc -c)
  printf 'w/o pipe: %07d, with pipe: %07d\n' $hex $b64
done
w/o pipe: 0000524, with pipe: 0000260
w/o pipe: 0003932, with pipe: 0001012
w/o pipe: 1037684, with pipe: 0230740

shards.sh:

#!/usr/bin/env bash

# shares on stdin
shares=($(cat))

# split one data blob
t=$(mktemp --directory --tmpdir sharding.XXXXXXXX)
echo "${shares[0]}" | head -1 | tail -c +67 | xxd -r -p > "$t/text"
zfec -m "$1" -k "$2" "$t/text"

# generate base64 shards
for i in $(seq 0 $(($1 - 1))); do
  ( echo "${shares[$i]}" | head -c 66 | xxd -r -p; cat "${t}/text.${i}_${1}.fec"; ) | base64 -w0;
  echo;
done

# cleanup tempdir
shred "$t"/*
rm -rf "$t"

Edit 1: And a script to reverse the process:

#!/usr/bin/env bash

tohex() { xxd -p | tr -d \\n; }
temp=$(mktemp --directory --tmpdir recombine.XXXXXXX)

keys=()
i=0

while read line; do
  keys[$i]=$(echo "$line" | base64 -d | head -c 33 | tohex;);
  echo "$line" | base64 -d | tail -c +34 > "$temp/data.$i";
  ((i++));
done

data=$(zunfec -fo /dev/stdout "$temp/data".* | tohex;)

for key in ${keys[@]}; do
  echo "${key}${data}";
done

rm -rf "$temp"

Usage becomes:

# create shares or 'shards'
$ i=0; echo 'Hello, World!' | secret-share-split -n 4 -t 3 | ./shards.sh 4 3 | while read line; do echo $line > hello.$((i++)); done
# delete one
$ rm -f hello.2
# recombine
$ cat hello.* | ./unshard.sh | secret-share-combine 
Hello, World!

Those two scripts basically implement the format that I would wish sss-cli provided. :)

Edit 2: created a gist for the scripts

dsprenkels commented 6 years ago

I kinda like the idea of erasure coding, but I think it is out of scope for a Shamir secret sharing library. I suspect we could do secret sharing by only using erasure coding, not needing any Shamir secret sharing. It would probably be cool to implement in a separate tool. I may even be able to help in implementing it.

Go for it. I'll probably drop a note in the readme file. :)

For now I will be closing this as wontfix.

ansemjo commented 6 years ago

Aye. I get your point. I don't think erasure coding alone would give you the same cryptographic properties though?

If I figure out how to bind to your library from Python, I could just use that and zfec as modules .. Haven't looked into that yet.

dsprenkels commented 6 years ago

No, I you're right. I was thinking of using erasure codes as a method for secret sharing instead of redundancy.

If I figure out how to bind to your library from Python, I could just use that and zfec as modules. Haven't looked into that yet.

I currently have no Python bindings yet (see https://github.com/dsprenkels/sss/issues/3) and I unfortunately do not have the time to make them myself. If these are done this should be quite easy and really cool.

ansemjo commented 6 years ago

fyi: I familiarized myself with Golang and put together your Go binding with a few other parts and built a commandline tool that outputs the data in a format as I desired it: ansemjo/go-shamirsplit

I took the easy route and combined your library with ed25519 signatures instead of implementing Feldman shares myself. Hopefully I didn't make any grave mistakes when plugging the parts together either.