jedisct1 / rust-aegis

AEGIS high performance ciphers for Rust.
MIT License

Stateful API to optimize MAC/checksum use cases #2

Closed DXist closed 5 months ago

DXist commented 1 year ago

I'm interested in a stateful AEGIS struct for MAC/checksum use cases.

In the MAC case, the nonce is zero and the secret message is empty, so the init logic could run only once, on the first MAC calculation.

On subsequent runs, the preinitialized state is just copied and only the update and finalize functions have to run. This would decrease the computation cost.

I'm also personally interested in a checksum scenario with a fixed secret key (zero). It could enable compile-time initialization.
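
The pattern described above can be sketched in Rust as follows. This is only an illustration of "initialize once, copy the state per message": `ToyState` and `ToyMac` are hypothetical names, and the mixing function is plain FNV-1a, which is neither AEGIS nor secure. It is not the API of this crate.

```rust
/// Toy keyed state. NOT AEGIS and NOT secure: it only illustrates
/// the "initialize once, copy per message" pattern.
#[derive(Clone, Copy)]
struct ToyState(u64);

impl ToyState {
    const PRIME: u64 = 0x0000_0100_0000_01B3; // FNV-1a 64-bit prime

    /// Stand-in for the key/nonce setup that only needs to run once per key.
    fn init(key: &[u8; 16]) -> Self {
        let mut st = ToyState(0xCBF2_9CE4_8422_2325); // FNV-1a offset basis
        st.update(key);
        st.update(&[0u8; 16]); // all-zero nonce, as in the MAC case
        st
    }

    /// Absorb message bytes into the state.
    fn update(&mut self, data: &[u8]) {
        for &byte in data {
            self.0 = (self.0 ^ u64::from(byte)).wrapping_mul(Self::PRIME);
        }
    }

    /// Absorb the message length and return the tag.
    fn finalize(mut self, len: usize) -> u64 {
        self.update(&(len as u64).to_le_bytes());
        self.0
    }
}

/// Holds the preinitialized state so that `init` runs once per key.
struct ToyMac {
    initial: ToyState,
}

impl ToyMac {
    fn new(key: &[u8; 16]) -> Self {
        ToyMac { initial: ToyState::init(key) }
    }

    /// Copy the cached state, then run only `update` and `finalize`.
    fn tag(&self, data: &[u8]) -> u64 {
        let mut st = self.initial; // cheap copy, no re-initialization
        st.update(data);
        st.finalize(data.len())
    }
}
```

Calling `ToyMac::new(&key)` once and then `mac.tag(msg)` for each message gives the same result as re-running `init` every time, while skipping the setup cost on every call after the first.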

jedisct1 commented 1 year ago

Keep in mind that it would be completely insecure with untrusted inputs.

If the key and nonce are known, it is possible to craft messages that will cancel the state. AEGIS is a MAC, but that doesn't imply that it is a hash function.

DXist commented 1 year ago

My goal is to validate checksums of protocol message headers and bodies, log messages for persistence, and verify checksums on storage reads to protect against silent sector read errors and misdirected reads.

So I'm focusing on protection against hardware faults rather than against an untrusted environment or Byzantine faults.
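
A minimal sketch of that storage use case, under stated assumptions: the checksum covers the sector address as well as the payload, so that a block returned from the wrong sector fails verification even though its own contents are intact. The names `write_block`, `verify_block`, and `StoredBlock` are hypothetical, and FNV-1a is only a dependency-free stand-in for a keyed MAC such as AEGIS.

```rust
/// Stand-in checksum (FNV-1a, 64-bit) over the sector address and payload.
/// A real deployment would use a keyed MAC here; this keeps the sketch
/// free of external dependencies.
fn checksum(sector: u64, payload: &[u8]) -> u64 {
    let addr = sector.to_le_bytes();
    let mut h: u64 = 0xCBF2_9CE4_8422_2325; // FNV-1a offset basis
    for &byte in addr.iter().chain(payload) {
        h = (h ^ u64::from(byte)).wrapping_mul(0x0000_0100_0000_01B3);
    }
    h
}

#[derive(Debug, PartialEq)]
enum ReadError {
    ChecksumMismatch,
}

/// What ends up on disk: the payload plus a checksum bound to its sector.
struct StoredBlock {
    payload: Vec<u8>,
    checksum: u64,
}

/// Compute the checksum at write time, binding it to the target sector.
fn write_block(sector: u64, payload: &[u8]) -> StoredBlock {
    StoredBlock {
        payload: payload.to_vec(),
        checksum: checksum(sector, payload),
    }
}

/// Recompute the checksum with the sector we *asked* for. A corrupted
/// payload or a block that came from a different sector both fail here.
fn verify_block(expected_sector: u64, block: &StoredBlock) -> Result<&[u8], ReadError> {
    if checksum(expected_sector, &block.payload) == block.checksum {
        Ok(&block.payload)
    } else {
        Err(ReadError::ChecksumMismatch)
    }
}
```

This only guards against accidental faults. With a known key, or an unkeyed function as in this sketch, it offers no protection against deliberately crafted inputs.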

DXist commented 1 year ago

Is it possible to include benchmarks that process only authenticated data?

I.e. an empty secret buffer and non-empty additional data.

jedisct1 commented 1 year ago

That was added to libaegis. You can run the benchmark with zig build -Dwith-benchmark -Drelease.

Results on a Zen4:

AEGIS-256    120105.42 Mb/s
AEGIS-256X2  245115.31 Mb/s
AEGIS-256X4  363015.59 Mb/s
AEGIS-128L   210626.07 Mb/s
AEGIS-128X2  407609.10 Mb/s
AEGIS-128X4  528268.03 Mb/s
AEGIS-128L MAC   254394.53 Mb/s
AEGIS-128X2 MAC  484187.39 Mb/s
AEGIS-128X4 MAC  565354.88 Mb/s

On a MacBook Pro:

AEGIS-256     57529.63 Mb/s
AEGIS-256X2   90841.93 Mb/s
AEGIS-256X4   87541.12 Mb/s
AEGIS-128L   108528.06 Mb/s
AEGIS-128X2  132905.43 Mb/s
AEGIS-128X4   75118.99 Mb/s
AEGIS-128L MAC   126402.38 Mb/s
AEGIS-128X2 MAC  166496.04 Mb/s
AEGIS-128X4 MAC  125925.78 Mb/s

On an old Xeon E5:

AEGIS-256     49566.71 Mb/s
AEGIS-256X2   55820.46 Mb/s
AEGIS-256X4   46122.33 Mb/s
AEGIS-128L    98238.74 Mb/s
AEGIS-128X2   79607.21 Mb/s
AEGIS-128X4   58271.26 Mb/s
AEGIS-128L MAC    99519.00 Mb/s
AEGIS-128X2 MAC  108481.01 Mb/s
AEGIS-128X4 MAC   92552.98 Mb/s

DXist commented 1 year ago

Thank you!

Does it make sense to overwrite the buffer on each benchmark iteration, so that the benchmark works with memory instead of the CPU cache?

In real life, data comes from outside the CPU and the memory bus is involved.

I tried using only one iteration in benchmark.zig and got numbers closer to the Rust benchmark's numbers.

jedisct1 commented 1 year ago

Yes, that's a good idea!

DXist commented 1 year ago

I think both options could be shown. A single-threaded application that uses a fixed preallocated buffer pool (maybe with pinned memory) and runs on a dedicated core could enjoy a high cache hit rate for its buffers.
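
One way to show both options is to run the same loop over a buffer pool of configurable size, sketched below. A pool of one buffer approximates the cache-friendly case. A pool whose total size exceeds the last-level cache is an assumption about how to approximate the memory-bound case, and whether it actually misses the cache depends on the hardware. `bench` and `BenchResult` are hypothetical names, and FNV-1a stands in for the function being measured.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Stand-in for the function under test (FNV-1a, 64-bit).
fn checksum(data: &[u8]) -> u64 {
    let mut h: u64 = 0xCBF2_9CE4_8422_2325;
    for &byte in data {
        h = (h ^ u64::from(byte)).wrapping_mul(0x0000_0100_0000_01B3);
    }
    h
}

struct BenchResult {
    bytes: usize,
    elapsed: Duration,
}

impl BenchResult {
    /// Throughput in megabits per second, the unit used in the tables above.
    fn mbps(&self) -> f64 {
        (self.bytes as f64 * 8.0) / 1e6 / self.elapsed.as_secs_f64()
    }
}

/// Process `iterations` buffers, cycling through `pool` in order.
/// With one buffer the data tends to stay in cache; with a pool larger
/// than the cache, each iteration is more likely to go to memory.
fn bench(pool: &[Vec<u8>], iterations: usize) -> BenchResult {
    assert!(!pool.is_empty(), "pool must contain at least one buffer");
    let mut bytes = 0usize;
    let start = Instant::now();
    for i in 0..iterations {
        let buf = &pool[i % pool.len()];
        // black_box keeps the compiler from optimizing the work away.
        black_box(checksum(black_box(buf.as_slice())));
        bytes += buf.len();
    }
    BenchResult { bytes, elapsed: start.elapsed() }
}
```

Reporting `mbps()` for both pool sizes side by side would let readers pick the figure that matches their workload.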

jedisct1 commented 5 months ago

Your wishes have come true. A new MAC API has been added in version 0.6.5.

DXist commented 5 months ago

@jedisct1, great! Thank you!