Open mmokrejs opened 1 month ago
On Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz I get:
$ openssl speed -evp chacha20
Doing chacha20 for 3s on 16 size blocks: 69709801 chacha20's in 3.00s
Doing chacha20 for 3s on 64 size blocks: 30841787 chacha20's in 3.00s
Doing chacha20 for 3s on 256 size blocks: 15912826 chacha20's in 3.00s
Doing chacha20 for 3s on 1024 size blocks: 8240465 chacha20's in 3.00s
Doing chacha20 for 3s on 8192 size blocks: 1074814 chacha20's in 3.00s
Doing chacha20 for 3s on 16384 size blocks: 539367 chacha20's in 3.00s
OpenSSL 1.1.1w 11 Sep 2023
built on: Fri Oct 13 21:05:46 2023 UTC
options:bn(64,64) rc4(16x,int) des(int) aes(partial) idea(int) blowfish(ptr)
compiler: x86_64-pc-linux-gnu-gcc -fPIC -pthread -m64 -Wa,--noexecstack -O2 -pipe -march=native -ftree-vectorize -fno-strict-aliasing -Wa,--noexecstack -DOPENSSL_USE_NODELETE -DL_ENDIAN -DOPENSSL_PIC -DOPENSSL_CPUID_OBJ -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_MONT5 -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DKECCAK1600_ASM -DRC4_ASM -DMD5_ASM -DAESNI_ASM -DVPAES_ASM -DGHASH_ASM -DECP_NISTZ256_ASM -DX25519_ASM -DPOLY1305_ASM -DNDEBUG -DOPENSSL_NO_BUF_FREELISTS
The 'numbers' are in 1000s of bytes per second processed.
type 16 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes 16384 bytes
chacha20 371785.61k 657958.12k 1357894.49k 2812745.39k 2934958.76k 2945662.98k
$
$ openssl speed -evp aes-256-cbc
Doing aes-256-cbc for 3s on 16 size blocks: 121275087 aes-256-cbc's in 2.99s
Doing aes-256-cbc for 3s on 64 size blocks: 43609451 aes-256-cbc's in 2.99s
Doing aes-256-cbc for 3s on 256 size blocks: 11103222 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 1024 size blocks: 2781847 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 8192 size blocks: 347985 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 16384 size blocks: 174283 aes-256-cbc's in 3.00s
OpenSSL 1.1.1w 11 Sep 2023
built on: Fri Oct 13 21:05:46 2023 UTC
options:bn(64,64) rc4(16x,int) des(int) aes(partial) idea(int) blowfish(ptr)
compiler: x86_64-pc-linux-gnu-gcc -fPIC -pthread -m64 -Wa,--noexecstack -O2 -pipe -march=native -ftree-vectorize -fno-strict-aliasing -Wa,--noexecstack -DOPENSSL_USE_NODELETE -DL_ENDIAN -DOPENSSL_PIC -DOPENSSL_CPUID_OBJ -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_MONT5 -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DKECCAK1600_ASM -DRC4_ASM -DMD5_ASM -DAESNI_ASM -DVPAES_ASM -DGHASH_ASM -DECP_NISTZ256_ASM -DX25519_ASM -DPOLY1305_ASM -DNDEBUG -DOPENSSL_NO_BUF_FREELISTS
The 'numbers' are in 1000s of bytes per second processed.
type 16 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes 16384 bytes
aes-256-cbc 648963.68k 933446.44k 947474.94k 949537.11k 950231.04k 951817.56k
$
$ openssl speed -evp aes-256-gcm
Doing aes-256-gcm for 3s on 16 size blocks: 84184815 aes-256-gcm's in 2.99s
Doing aes-256-gcm for 3s on 64 size blocks: 55949017 aes-256-gcm's in 3.00s
Doing aes-256-gcm for 3s on 256 size blocks: 25053937 aes-256-gcm's in 3.00s
Doing aes-256-gcm for 3s on 1024 size blocks: 9488193 aes-256-gcm's in 3.00s
Doing aes-256-gcm for 3s on 8192 size blocks: 1409041 aes-256-gcm's in 3.00s
Doing aes-256-gcm for 3s on 16384 size blocks: 714001 aes-256-gcm's in 3.00s
OpenSSL 1.1.1w 11 Sep 2023
built on: Fri Oct 13 21:05:46 2023 UTC
options:bn(64,64) rc4(16x,int) des(int) aes(partial) idea(int) blowfish(ptr)
compiler: x86_64-pc-linux-gnu-gcc -fPIC -pthread -m64 -Wa,--noexecstack -O2 -pipe -march=native -ftree-vectorize -fno-strict-aliasing -Wa,--noexecstack -DOPENSSL_USE_NODELETE -DL_ENDIAN -DOPENSSL_PIC -DOPENSSL_CPUID_OBJ -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_MONT5 -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DKECCAK1600_ASM -DRC4_ASM -DMD5_ASM -DAESNI_ASM -DVPAES_ASM -DGHASH_ASM -DECP_NISTZ256_ASM -DX25519_ASM -DPOLY1305_ASM -DNDEBUG -DOPENSSL_NO_BUF_FREELISTS
The 'numbers' are in 1000s of bytes per second processed.
type 16 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes 16384 bytes
aes-256-gcm 450487.30k 1193579.03k 2137935.96k 3238636.54k 3847621.29k 3899397.46k
$
$ openssl speed -evp aes-128-gcm
Doing aes-128-gcm for 3s on 16 size blocks: 90338227 aes-128-gcm's in 2.99s
Doing aes-128-gcm for 3s on 64 size blocks: 58401996 aes-128-gcm's in 2.99s
Doing aes-128-gcm for 3s on 256 size blocks: 30035335 aes-128-gcm's in 3.00s
Doing aes-128-gcm for 3s on 1024 size blocks: 12146898 aes-128-gcm's in 3.00s
Doing aes-128-gcm for 3s on 8192 size blocks: 1931301 aes-128-gcm's in 3.00s
Doing aes-128-gcm for 3s on 16384 size blocks: 984404 aes-128-gcm's in 3.00s
OpenSSL 1.1.1w 11 Sep 2023
built on: Fri Oct 13 21:05:46 2023 UTC
options:bn(64,64) rc4(16x,int) des(int) aes(partial) idea(int) blowfish(ptr)
compiler: x86_64-pc-linux-gnu-gcc -fPIC -pthread -m64 -Wa,--noexecstack -O2 -pipe -march=native -ftree-vectorize -fno-strict-aliasing -Wa,--noexecstack -DOPENSSL_USE_NODELETE -DL_ENDIAN -DOPENSSL_PIC -DOPENSSL_CPUID_OBJ -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_MONT5 -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DKECCAK1600_ASM -DRC4_ASM -DMD5_ASM -DAESNI_ASM -DVPAES_ASM -DGHASH_ASM -DECP_NISTZ256_ASM -DX25519_ASM -DPOLY1305_ASM -DNDEBUG -DOPENSSL_NO_BUF_FREELISTS
The 'numbers' are in 1000s of bytes per second processed.
type 16 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes 16384 bytes
aes-128-gcm 483415.26k 1250076.17k 2563015.25k 4146141.18k 5273739.26k 5376158.38k
$
On Qualcomm Snapdragon 888 2.84GHz (SM8350, Kryo 680) I get:
~ $ openssl speed -evp chacha20
Doing ChaCha20 ops for 3s on 16 size blocks: 47637197 ChaCha20 ops in 2.96s
Doing ChaCha20 ops for 3s on 64 size blocks: 19894025 ChaCha20 ops in 2.96s
Doing ChaCha20 ops for 3s on 256 size blocks: 9541792 ChaCha20 ops in 2.96s
Doing ChaCha20 ops for 3s on 1024 size blocks: 3261224 ChaCha20 ops in 2.96s
Doing ChaCha20 ops for 3s on 8192 size blocks: 411641 ChaCha20 ops in 2.96s
Doing ChaCha20 ops for 3s on 16384 size blocks: 202640 ChaCha20 ops in 2.96s
version: 3.2.1
built on: Tue Feb 20 04:44:24 2024 UTC
options: bn(64,64)
compiler: aarch64-linux-android-clang -fPIC -pthread -Wa,--noexecstack -Qunused-arguments -fstack-protector-strong -Oz -DNO_SYSLOG -DOPENSSL_USE_NODELETE -DOPENSSL_PIC -DOPENSSL_BUILDING_OPENSSL -DZLIB -DZLIB_SHARED -DNDEBUG -I/data/data/com.termux/files/usr/include
CPUINFO: OPENSSL_armcap=0xbd
The 'numbers' are in 1000s of bytes per second processed.
type 16 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes 16384 bytes
ChaCha20 257498.36k 430141.08k 825236.06k 1128207.22k 1139244.28k 1121639.78k
~ $ openssl speed -evp chacha20
Doing ChaCha20 ops for 3s on 16 size blocks: 47928059 ChaCha20 ops in 2.96s
Doing ChaCha20 ops for 3s on 64 size blocks: 20319820 ChaCha20 ops in 2.96s
Doing ChaCha20 ops for 3s on 256 size blocks: 9590102 ChaCha20 ops in 2.96s
Doing ChaCha20 ops for 3s on 1024 size blocks: 3252023 ChaCha20 ops in 2.96s
Doing ChaCha20 ops for 3s on 8192 size blocks: 418673 ChaCha20 ops in 2.96s
Doing ChaCha20 ops for 3s on 16384 size blocks: 212663 ChaCha20 ops in 2.96s
version: 3.2.1
built on: Tue Feb 20 04:44:24 2024 UTC
options: bn(64,64)
compiler: aarch64-linux-android-clang -fPIC -pthread -Wa,--noexecstack -Qunused-arguments -fstack-protector-strong -Oz -DNO_SYSLOG -DOPENSSL_USE_NODELETE -DOPENSSL_PIC -DOPENSSL_BUILDING_OPENSSL -DZLIB -DZLIB_SHARED -DNDEBUG -I/data/data/com.termux/files/usr/include
CPUINFO: OPENSSL_armcap=0xbd
The 'numbers' are in 1000s of bytes per second processed.
type 16 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes 16384 bytes
ChaCha20 259070.59k 439347.46k 829414.23k 1125024.17k 1158705.82k 1177118.44k
~ $ openssl speed -evp aes-256-gcm
Doing AES-256-GCM ops for 3s on 16 size blocks: 70888598 AES-256-GCM ops in 2.96s
Doing AES-256-GCM ops for 3s on 64 size blocks: 44086448 AES-256-GCM ops in 2.95s
Doing AES-256-GCM ops for 3s on 256 size blocks: 19245107 AES-256-GCM ops in 2.95s
Doing AES-256-GCM ops for 3s on 1024 size blocks: 6050901 AES-256-GCM ops in 2.96s
Doing AES-256-GCM ops for 3s on 8192 size blocks: 863418 AES-256-GCM ops in 2.95s
Doing AES-256-GCM ops for 3s on 16384 size blocks: 421934 AES-256-GCM ops in 2.95s
version: 3.2.1
built on: Tue Feb 20 04:44:24 2024 UTC
options: bn(64,64)
compiler: aarch64-linux-android-clang -fPIC -pthread -Wa,--noexecstack -Qunused-arguments -fstack-protector-strong -Oz -DNO_SYSLOG -DOPENSSL_USE_NODELETE -DOPENSSL_PIC -DOPENSSL_BUILDING_OPENSSL -DZLIB -DZLIB_SHARED -DNDEBUG -I/data/data/com.termux/files/usr/include
CPUINFO: OPENSSL_armcap=0xbd
The 'numbers' are in 1000s of bytes per second processed.
type 16 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes 16384 bytes
AES-256-GCM 383181.61k 956451.75k 1670083.86k 2093284.67k 2397667.88k 2343378.53k
~ $ openssl speed -evp aes-256-gcm
Doing AES-256-GCM ops for 3s on 16 size blocks: 68787102 AES-256-GCM ops in 2.96s
Doing AES-256-GCM ops for 3s on 64 size blocks: 43635690 AES-256-GCM ops in 2.95s
Doing AES-256-GCM ops for 3s on 256 size blocks: 18168661 AES-256-GCM ops in 2.95s
Doing AES-256-GCM ops for 3s on 1024 size blocks: 6089534 AES-256-GCM ops in 2.96s
Doing AES-256-GCM ops for 3s on 8192 size blocks: 818217 AES-256-GCM ops in 2.95s
Doing AES-256-GCM ops for 3s on 16384 size blocks: 424565 AES-256-GCM ops in 2.95s
version: 3.2.1
built on: Tue Feb 20 04:44:24 2024 UTC
options: bn(64,64)
compiler: aarch64-linux-android-clang -fPIC -pthread -Wa,--noexecstack -Qunused-arguments -fstack-protector-strong -Oz -DNO_SYSLOG -DOPENSSL_USE_NODELETE -DOPENSSL_PIC -DOPENSSL_BUILDING_OPENSSL -DZLIB -DZLIB_SHARED -DNDEBUG -I/data/data/com.termux/files/usr/include
CPUINFO: OPENSSL_armcap=0xbd
The 'numbers' are in 1000s of bytes per second processed.
type 16 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes 16384 bytes
AES-256-GCM 371822.17k 946672.60k 1576670.24k 2106649.60k 2272147.00k 2357990.83k
~ $
So, for 64-byte blocks, 20319820 ChaCha20 ops vs. 43635690 AES-256-GCM ops on a mobile phone shows that AES is actually faster on that device. And by the way, the phone is not that much slower than a laptop with AES instructions in the CPU, which does 30841787 chacha20 ops vs. 55949017 aes-256-gcm ops for 64-byte blocks. Anyway, nobody is going to use a mobile phone to download genomic data, and definitely not as a submission host.
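For anyone who wants to check the arithmetic, here is a minimal sketch that turns the raw operation counts into the "1000s of bytes per second" figures printed by openssl speed and compares the two ciphers per device. The helper names are made up for illustration, and for the phone the second of the two runs of each cipher is used.

```python
def throughput_k(ops: int, block_size: int, seconds: float) -> float:
    """Throughput in 1000s of bytes per second, the unit `openssl speed` prints."""
    return ops * block_size / seconds / 1000.0


# (operations, elapsed seconds) for 64-byte blocks, copied from the output above.
RUNS_64 = {
    ("laptop", "chacha20"): (30841787, 3.00),
    ("laptop", "aes-256-gcm"): (55949017, 3.00),
    ("phone", "chacha20"): (20319820, 2.96),
    ("phone", "aes-256-gcm"): (43635690, 2.95),
}


def gcm_vs_chacha(device: str) -> float:
    """Ratio of AES-256-GCM to ChaCha20 throughput on one device (64-byte blocks)."""
    gcm_ops, gcm_s = RUNS_64[(device, "aes-256-gcm")]
    cha_ops, cha_s = RUNS_64[(device, "chacha20")]
    return throughput_k(gcm_ops, 64, gcm_s) / throughput_k(cha_ops, 64, cha_s)


if __name__ == "__main__":
    for device in ("laptop", "phone"):
        print(device, f"AES-256-GCM / ChaCha20 = {gcm_vs_chacha(device):.2f}x")
```

Note that this only compares the bare ChaCha20 stream cipher with AES-256-GCM; the runs above did not benchmark chacha20-poly1305, so the AEAD-to-AEAD comparison is not covered here.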
Useful overview: https://csrc.nist.gov/Projects/fips-140-3-transition-effort
You will NOT find ChaCha20, BLAKE, or Salsa ciphers in the "Algorithm:" dropdown (ENUM) listings at https://csrc.nist.gov/projects/cryptographic-module-validation-program/validated-modules/search under the Advanced tab.
Probably the easiest way is to look at the approved algorithms in the OpenSSL build of some large vendor:
which leads to https://csrc.nist.gov/projects/cryptographic-module-validation-program/certificate/4746 (openssl) https://csrc.nist.gov/projects/cryptographic-module-validation-program/certificate/4434 (Linux kernel crypto)
and
which leads e.g. to https://csrc.nist.gov/projects/cryptographic-module-validation-program/certificate/4292 (openssl) https://csrc.nist.gov/projects/cryptographic-module-validation-program/certificate/4366 (Linux kernel crypto)
or
which leads e.g. to https://csrc.nist.gov/projects/cryptographic-module-validation-program/certificate/4739 (Linux kernel crypto)
I think a more interesting read would be which of the Block Cipher Modes of Operation should be used for genomic data, given that every single personal genome assembly will start with chr1 and continue with a telomeric repeat; likewise, whether the similar, if not exactly the same, organization/ordering of data in SAM/BAM/FASTQ makes some modes more exploitable than others. And yes, also the TLS file size limits and the security implications enforced by the different algorithm modes. I know, it is tough.
crypt4gh is a standard for file storage at rest. If you want to transmit files securely, you should use a protocol designed for that (i.e. TLS). Yes, this does mean the data may be encrypted twice if you transmit it.
ChaCha20-Poly1305 was chosen because it was already used in existing standards, has good library support and is relatively easy to use. AES-GCM mode was considered at the time, but mainly rejected due to the limitation on the amount of data that could be encrypted under a single key (around 64 GB, see also NIST Special Publication 800-38D section 8.3). As genomic data files can often be bigger than this, it would have introduced some complication around the need to encrypt large files using more than one key. There have been calls for an improved cipher that avoids this problem, but I would imagine that it will be a while before anything gets approved.
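To put a number on the "around 64 GB" figure, here is a small sketch of the arithmetic. It assumes the figure refers to the plaintext length bound in SP 800-38D (at most 2^39 - 256 bits per GCM invocation); that reading, the helper names and the 200 GiB example file are assumptions for illustration only.

```python
# Assumed bound: SP 800-38D limits the plaintext of one GCM invocation
# to 2**39 - 256 bits.
GCM_MAX_PLAINTEXT_BITS = 2**39 - 256
GCM_MAX_PLAINTEXT_BYTES = GCM_MAX_PLAINTEXT_BITS // 8  # 2**36 - 32 bytes


def invocations_needed(file_size_bytes: int) -> int:
    """How many separate GCM invocations a file of this size would need
    if each one encrypts as much as the bound allows (ceiling division)."""
    return -(-file_size_bytes // GCM_MAX_PLAINTEXT_BYTES)


if __name__ == "__main__":
    gib = 2**30
    print("limit in bytes:", GCM_MAX_PLAINTEXT_BYTES)
    print("limit in GiB:  ", GCM_MAX_PLAINTEXT_BYTES / gib)
    # Hypothetical 200 GiB file, just to show the effect of the bound.
    print("pieces for 200 GiB:", invocations_needed(200 * gib))
```

So a file just over 64 GiB already has to be split, which is the complication mentioned above.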
AES-256-CBC does not provide authentication and is vulnerable to padding oracle attacks. While these problems can be worked around, it is easier to use an AEAD construct that does not have them, like AES-GCM or ChaCha20-Poly1305.
Unfortunately, as noted, ChaCha20-Poly1305 does not have FIPS approval, which does mean users who need FIPS compliance cannot currently use crypt4gh. The specification could be extended to support compliant encryption (it was designed to allow this sort of extension) if there is demand for such an upgrade. This is more likely to happen if anyone who would like to see the change is willing to help implement it.
Hi, I wondered for a while why ChaCha20-Poly1305 was selected by GA4GH for data encryption. The closest I could find were notes on the fixed block size making it possible to jump into the middle of a stream to decrypt the content (after indexing), and that TLS would need to re-encrypt the data. To some extent, the concerns were about the maximum file (stream) sizes transferable during a single network connection.
The document http://samtools.github.io/hts-specs/crypt4gh.pdf is still quite sparse on this, and I wonder whether there is an explicit explanation of why AES-256-GCM was not selected. The data will be transferred via servers (Intel/AMD based), so the rumor that ChaCha20 is faster on mobile devices is beside the point, IMO. From reading some docs on the internet, the key to speed is actually not the AES instructions; the availability of SSE, SSE2 and AVX instructions and registers already provides most of the benefit to ciphers utilizing tuples with matching sizes [see section 4.1 in Gimli 20190927]. Moreover, my mobile phone is only marginally slower than the laptop. On that same phone, AES-256-GCM is faster than ChaCha20. What am I missing?
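To make the "jump into the middle of a stream" point concrete, here is a rough sketch of the offset arithmetic. It assumes the segment layout as I understand it from the crypt4gh document: 65536-byte plaintext segments, each stored as a 12-byte nonce, the ciphertext, and a 16-byte Poly1305 tag. The function name and the header_len parameter are hypothetical; the real header length varies per file and has to be read from the file itself.

```python
SEGMENT_PLAIN = 65536   # assumed plaintext bytes per segment
NONCE_LEN = 12          # assumed per-segment nonce length
TAG_LEN = 16            # assumed Poly1305 tag length
SEGMENT_CIPHER = NONCE_LEN + SEGMENT_PLAIN + TAG_LEN  # 65564 bytes on disk


def locate(plain_offset: int, header_len: int = 0) -> tuple[int, int, int]:
    """Map a plaintext offset to (segment index, offset inside that segment,
    file offset where the encrypted segment starts)."""
    index, within = divmod(plain_offset, SEGMENT_PLAIN)
    return index, within, header_len + index * SEGMENT_CIPHER


if __name__ == "__main__":
    # Only the one segment holding byte 200000 has to be read and decrypted.
    print(locate(200000))
```

Because every segment is authenticated and decrypted on its own, a reader can seek to the segment it needs without touching the rest of the file.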
I think none of these really justify use of an untested/non-certified algorithm: https://csrc.nist.gov/pubs/fips/140-3/final
Anyway, the major argument is from an auditing and compliance perspective. The algorithm is not accepted by FIPS 140, not even by FIPS 140-3.
Let me quote from the FIPS-140-3 Cryptographic Module Validation Program CMVP addendum: Use of Non-validated Cryptographic Modules by Federal Agencies and Departments
Non-validated cryptography is viewed by NIST as providing no protection to the information or data—in effect the data would be considered unprotected plaintext. If the agency specifies that the information or data be cryptographically protected, then FIPS 140-2 or FIPS 140-3 is applicable. In essence, if cryptography is required, then it must be validated. Should the cryptographic module be revoked, use of that module is no longer permitted.
If LocalEGA and FEGA and other tools, including samtools and htsget, are to be deployed world-wide, then commonly accepted and certified crypto must be enabled by default. Are we allowed at all to store data using an uncertified algorithm, which is officially recognized as providing no protection? I propose switching to AES. Hundreds of software tools, Linux distros, etc., undergo validation. Obviously, tools exposing uncertified ChaCha20 break eventual certification; see some examples and search for ChaCha20:
https://csrc.nist.gov/CSRC/media/projects/cryptographic-module-validation-program/documents/security-policies/140sp4046.pdf
https://csrc.nist.gov/CSRC/media/projects/cryptographic-module-validation-program/documents/security-policies/140sp4284.pdf
https://csrc.nist.gov/CSRC/media/projects/cryptographic-module-validation-program/documents/security-policies/140sp3820.pdf
More can be found at https://www.nist.gov/search?s=ChaCha20&from=2&index=all-meta-engine&order=r&rpp=10
Same applies to BLAKE2, Salsa20 and successors. BLAKE2 was also selected by GA4GH. I am not certain on compliance of X25519, based on Curve25519.
I am not a crypto expert, but it seems that in multi-user settings AES-256-CBC would be advantageous over -GCM. And the submission hosts will encrypt data concurrently, right?
For ChaCha20 security comparison against AES-GCM, see https://dl.acm.org/doi/abs/10.1145/3460120.3484814 , and from https://dl.acm.org/action/downloadSupplement?doi=10.1145%2F3460120.3484814&file=CCS21-fp593.mp4 I quote two slides: