NebulousLabs / Sia

Blockchain-based marketplace for file storage. Project has moved to GitLab: https://gitlab.com/NebulousLabs/Sia
https://sia.tech
MIT License
2.71k stars 440 forks source link

Faster encryption #2200

Open lukechampine opened 7 years ago

lukechampine commented 7 years ago

Our file encryption algorithm, Twofish, suffers from a slow implementation. I ran a simple benchmark comparing it to AES and ChaCha20:

BenchmarkEncrypt/Twofish_4MB-4               20      76569138 ns/op      54.78 MB/s
BenchmarkEncrypt/AES_4MB-4                 1000       1607839 ns/op    2608.66 MB/s
BenchmarkEncrypt/ChaCha20_4MB-4             500       2230698 ns/op    1880.26 MB/s

54 MB/s is fast enough to not introduce noticeable delay on a typical internet connection, but it will be increasingly significant as average bandwidth increases and larger clients start adopting Sia. We should either:

The latter would be easier, compatibility-wise, but it requires money, time, and faith in the developer to write secure code. If we go with the former, I'd prefer to avoid AES since it is vulnerable to timing attacks on hardware that doesn't support the AES-NI instruction set.

rudi-cilibrasi commented 7 years ago

This https://github.com/drewcsillag/twofish Twofish implementation measures 7800 MB/s on my machine. Do you think it is fast enough to escape the assembly and just write a Go connector e.g. with cgo or something like that?

DavidVorick commented 7 years ago

7800 MB/s? Is that multi-threaded? I believe that the benchmarks above are single threaded, so your stats do not compare.

That said, it's still probably a massive improvement over our 50 MB/s single-threaded performance. Just want to make sure we're using a level playing field.

Unfortunately, to keep our build process as simple as possible we need to avoid using cgo. My preference is to stay away from ChaCha20 and AES largely because their constants were not chosen using NUMS strategies. Twofish iirc does a much better job of this.

I'm also open to other potential encryption algorithms. Something like blake2b + auth would get us 600 MB/s or more using our current blake2b algorithm. While that falls short of what ChaCha20 can do, it's still a lot better than Twofish. But it also means we'd be rolling our own auth, which is another place for mistakes to slip in.

rudi-cilibrasi commented 7 years ago

I am not 100% sure, but I do know: the simple Makefile does not link with -lpthread, the substring thread does not appear in a case-insensitive search anywhere in the source code, and a cursory reading of opt2.c did not show any multi-threading. Curious if anybody finds anything suggesting it is a multi-threaded benchmark.

rudi-cilibrasi commented 7 years ago

I'm always glad for a simpler build process @DavidVorick and I don't mind not using cgo at all; I've never used it either. I just have no idea what the clean model to follow for C-golang extension integration.

lukechampine commented 7 years ago

I think gcc has become smarter in the time since that code was written...

When I first ran the C benchmark, I saw the following result:

encs/sec = 3649635036.496350
bytes/sec = 58394160583.941605
KB/sec = 57025547.445255
MB/sec = 55689.011177

55 GB/s! Hard to believe. But compile with -O1 and the performance drops to 180 MB/s. Do you see where this is going?

The benchmark, as written, allows gcc to optimize out the entire body of the encrypt function, since it has no observable side effects. But if we print the final encrypted output, this optimization is no longer possible and the performance drops back to the realm of possibility (180 MB/s). This is fairly fast, but not as fast as asm Go packages.

rudi-cilibrasi commented 7 years ago

I see yes the optimizer explanation is it. Please let me know if you think this C version speedup is worth it after considering the rest of the options.

starius commented 7 years ago

AES works so fast because of AES-NI (triggered by the ASM code) not because of ASM implementation itself.

Comparison of twofish vs aes in Go:

Encryption of 1GB with twofish takes 14.391475843s.
Encryption of 1GB with AES takes 2.590288861s.

(source)

fzgregor commented 7 years ago

There is this assembly AES code from Intel. If I remember correctly (has been a while), it achieves more than 5 GB/s single threaded on a Skylake processor. Creating a Go package for this library should take considerable less effort than reinventing it and require less trust in the developer. This package could also be built in a separate project keeping Sia's build system clean.

lukechampine commented 7 years ago

Go's standard library implementation of AES is fine. The question is whether we're okay with using a cipher that requires special hardware to ward off timing attacks, as well as @DavidVorick's point about NUMS constants.

ChaCha20's magic numbers seem okay:

   The ChaCha20 state is initialized as follows:

   o  The first four words (0-3) are constants: 0x61707865, 0x3320646e,
      0x79622d32, 0x6b206574.

These may look suspicious at first, but it's actually just the string "expand 32-byte k" split into 4 uint32s.

rudibs commented 7 years ago

that expand 32-byte k thing is pretty good evidence that we need not be worried yet given the reality of what happened with DES, NSA, and differential cryptanalysis we can never be too careful. Thanks, BTW! ;-)