jcushman / libgfshare

Fork of http://www.digital-scurf.org/software/libgfshare
Other
12 stars 6 forks source link

Non-upstreamed changes #1

Closed kinnison closed 4 years ago

kinnison commented 9 years ago

Hi,

I am the libgfshare author and I was wondering if you ever had any plans to discuss your changes and consider upstreaming anything?

jcushman commented 9 years ago

Hi! Thanks for checking in and for your work on libgfshare.

I'm not sure my changes are all that useful, but see what you think.

If any of that's useful you are of course more than welcome to pull upstream, but I'm not using this enough/competent enough to want to submit a patch. :) For now I'm just using this repo to keep track of things I was messing around with.


The larger issue I'd love to kick around with you is whether gfsplit is presently over-promising in terms of security, and what might be a better way to present it.

Shamir secret sharing in theory offers the property in your readme, that "any less than N shares yields no information whatsoever." But that is only true if the coefficients are truly random and independent.

/dev/urandom is not truly random -- it's an AES stream that is occasionally reseeded from the system entropy pool. So gfsplit isn't provably secure, it's only secure-to-the-extent-AES-is-secure. At best I'm pretty sure an attacker could brute force some amount of info from a gfsplit share by brute forcing 2^256 AES keys (or whatever key length your OS uses for /dev/random).

If you trust AES, there is an alternate algorithm that is much more space efficient for files longer than a few dozen bytes: generate an AES key, encrypt the input file, and hand out Shamir splits of the key along with Reed-Solomon splits of the input file. You get the same security, but shares are the size of the input file divided by the recovery threshold (plus a small constant header) instead of being the same size as the input.

Alternatively, if info-theoretic security is really important for your use case (which it is for mine), you have to make sure that you're using RDSEED (if you trust that) or another true RNG.

So I'm thinking that a tool like this should either insist on a hardware TRNG; or use the AES+Reed-Solomon technique; or explain to users that it doesn't offer info-theoretic security or space efficiency as-is (which might be fine for the primary usecases like backing up GPG keys).

I dunno -- what do you think?

kinnison commented 9 years ago

Firstly, sorry for taking so long to get back to you.

I will look at sorting something less drastic for the macos fix. I'm unlikely to pull in the random/urandom stuff into the core library since that bloats it for platforms which don't have those APIs (and yes it is in use on those platformI will cs). The aes+reed-solomon idea is interesting. I shall have to ponder it.

Thanks again for your input, and enjoy your use of libgfshare :)

jcushman commented 9 years ago

Re: random/urandom, my feeling is that it would be better to offer no default randomness provider at all, and require implementers to make an informed decision about providing one, than to provide a non-secure default. This avoids the mistake I've made myself (being a dummy) of implementing libgfshare correctly, then breaking my random-function pointer setting, and having it fall back silently on random() instead of breaking noisily.

Defaulting to /dev/urandom (if available) gets you a reasonable default on Mac/Linux/BSD, and otherwise falls back to the explicit-opt-in situation I described above, which is better than a non-secure default.

(I realize this would be a backwards-incompatible change and isn't something to do lightly.)

Re: AES+RS, if you want to look into it, here's a paper laying out the rationale for that approach more formally:

http://link.springer.com/chapter/10.1007%2F3-540-48329-2_12

Basically I think it's important in the libgfshare docs to make clear that it only provides info-theory secrecy if it's fed by a TRNG. Particularly with files much larger than the entropy pool, if you use /dev/urandom you're getting something very different from the guarantees offered by Shamir secret sharing in theory.

And then if it turns out most users are satisfied with computational secrecy for large files, it's worth providing (or pointing them towards) techniques that are much more space efficient.

(Hope this doesn't come across as critical -- just stuff I've been chewing on recently.)

hyperbolicTom commented 8 years ago

This patch fixes a crash on my machine. I'm using libgfshare inside Python, and somehow things are being freed in a funny order. Filling the context with random numbers makes the next free fail. (Yes I know there shouldn't be one.) Zeroing the struct instead means the pointers will be null and nothing will happen, so this is safer. The second hunk of this patch seems to be a bug? Why start one byte off? You're skipping the first byte.

diff --git a/src/libgfshare.c b/src/libgfshare.c index 1db179d..955291e 100644 --- a/src/libgfshare.c +++ b/src/libgfshare.c @@ -135,15 +135,15 @@ gfshare_ctx_init_dec( unsigned char* sharenrs, return ctx; }

-/* Free a share context's memory. / +/ Free a share context's memory. Wipe it safely first. _/ void gfshare_ctx_free( gfsharectx ctx ) {

@@ -174,7 +174,8 @@ gfshare_ctx_enc_getshare( gfshare_ctx* ctx, unsigned char coefficient_ptr = ctx->buffer; unsigned char share_ptr; for( pos = 0; pos < ctx->size; ++pos )

hyperbolicTom commented 8 years ago

To be clear, in the above patch, remove the fill_rand calls and replace with memset.

In the second hunk the (p++) line seems wrong, the p++ line seems right.

kinnison commented 8 years ago

The second hunk is crazy, foo = p++ is exactly the same (code-wise) as foo = (p++) only slightly less easy to read.

The first hunk I'm less sure about -- if you're getting crashes, and the random fill is not exceeding bounds in the allocated chunks (checkable with valgrind) then it's crazy to expect that zeroing rather than random fill is going to fix anything buggy in libgfshare.

hyperbolicTom commented 8 years ago

Yeah, I'm sorry about that, I realized later they were the same. In all my years I've never seen "*(p++)" and I got confused.

The bug, here, is that I'm calling this from Python, and Python doesn't give you much control over memory allocation and deallocation (that's kind of the point). And I realize that filling with random is a useful debugging tool. It found this problem for sure. Somewhere, it's using values from the deallocated memory. I know this is a bug. Although, free(0) is supposed to work.

So, no problem with libgfshare. No. Also I'm calling it from a Cython wrapper. I don't want to go debugging in there. Never mind.

On Tue, Oct 27, 2015 at 3:49 AM, Daniel Silverstone < notifications@github.com> wrote:

The second hunk is crazy, foo = p++ is exactly the same (code-wise) as foo = (p++) only slightly less easy to read.

The first hunk I'm less sure about -- if you're getting crashes, and the random fill is not exceeding bounds in the allocated chunks (checkable with valgrind) then it's crazy to expect that zeroing rather than random fill is going to fix anything buggy in libgfshare.

— Reply to this email directly or view it on GitHub https://github.com/jcushman/libgfshare/issues/1#issuecomment-151404297.

"A man of genius makes no mistakes. His errors are volitional and are the portals of discovery." -- James Joyce