Securely instantiating the PRG

cjpatton commented 2 years ago

In Prio3, PrgAes128.derive_seed() is used with a fixed seed for the Fiat-Shamir heuristic. We need to decide if this is safe. It would be sufficient to prove, say, that this function is indifferentiable from a random oracle when modeling AES as an ideal cipher.

Note that we might end up picking a PRG in #32 that is already safe a safe choice here, in which case we should just replace PrgAes128.

cjpatton commented 1 year ago

TODO:

[ ] Add Prg variant based on SHA-3 (SHAKE)
[ ] Either dump PrgAes128 or replace it with a is fixed-key alternative and can instantiate Prg.

cjpatton commented 1 year ago

https://eprint.iacr.org/2019/074.pdf https://eprint.iacr.org/2022/1431.pdf

cjpatton commented 1 year ago

I put together a couple prototypes of PRG alternatives in libprio: https://github.com/divviup/libprio-rs/compare/cjpatton/prg-prototypes

One is based on KangarooTwelve (a CFRG spec) and the other is based on SHA-3 (SHAKE128). Initial benchmarks are encouraging. The following table shows Prio3CountVec shard time on my laptop for various input lengths with no multithreading:

PRG	100	1000	10,000
PrgAes128 (status quo)	149.72 us	3.13 ms	22.6 ms
PrgK12 (KangarooTwelve)	156.69 us	3.15 ms	22.7 ms
PrgSha3 (SHA-3)	171.14 us	3.21 ms	23.6 ms

We still need to benchmark Poplar1. The implementation is WIP: https://github.com/divviup/libprio-rs/pull/381

cjpatton commented 1 year ago

Per @simon-friedberger's request, I've updated the benchmark branch of libprio with a PRG based on HKDF-SHA256. I played around with a few variants, all are roughly as performant:

PRG	100	1000	10,000
PrgHkdfSha256	465.88 us	n/a	n/a

(Note that input sizes of 1000 and 10,000 exceed the the maximum output length for HKDF-SHA256.)

The performance is quite bad compared to the other algorithms. I'm not sure if we're using a software implementation of SHA-256, but if we are, these numbers aren't too surprising given how much hashing is involved in HKDF.

cjpatton commented 1 year ago

@divergentdave and I have been working on Poplar1 in libprio: https://github.com/divviup/libprio-rs/pull/434

We're still working on benchmarks, but so far it looks like SHA-3 is up to 40% more expensive than PrgAes128. This confirms @schoppmp's hypothesis that Prio is arithmetic-bound and Poplar is PRG-bound.

cjpatton commented 1 year ago

We have merged the basic functionality for Poplar1 into libprio (🎉). I have also updated my PRG prototypes branch (https://github.com/divviup/libprio-rs/compare/cjpatton/prg-prototypes) with benchmarks with Poplar1 using the current PrgAes128 and the alternative PrgSha3. The table below summarizes shard and preparation timings for both variants and for various bit lengths.

PRG	16 bits	64 bits	128 bits	256
PrgAes128 shard time	49.4 us	143.8 us	270.6 us	519.5 us
PrgSha3 shard time	49.2 us	143.8 us	268.9 us	518.2 us
PrgAes128 prep time	438.2 us	1.4 ms	2.7 ms	5.3 ms
PrgSha3 prep time	667.0 us	2.4 ms	4.5 ms	8.9 ms

(*) For prep time, we're measuring the time it takes for an Aggregator to evaluate an IDPF share and compute its sketch share. We are evaluating at the last level the three, which exhibits worst case performance. Prep time is sensitive not only to the the number of candidate prefixes, but the distribution. This is because the IDPF evaluation caches intermediate results. To compute the candidate prefixes, we sample 1000 measurements from a Zipf distribution (as suggested by the original Poplar paper) and compute the prefix tree for threshold = 10.

Observations:

There is no significant difference between the shard times, regardless of the input length. (Good news for the Client.)
For the range of bit lengths tested, prep time regresses by as much as 67%.

Caveats:

We are still investigating the best caching strategy for IDPF evaluation (https://github.com/divviup/libprio-rs/issues/441). We expect we will be able to improve performance, at least somewhat, across the board.
If the application can cache per-report IDPF state, then much of this runtime would be amortized across all of the levels.

cjpatton commented 1 year ago

Conclusion: SHA-3 performs well enough in all situations except IDPF evaluation. There the performance hit is significant enough to warrant investigating an alternative that has hardware support.I propose focusing #32 on this question. Further, there is no need to design to the Prg API; something designed specifically for IDPF would make sense.

In the meantime, I propose we go ahead with replacing PrgAes128 with something based on SHA-3. Here is my proposal: https://github.com/cfrg/draft-irtf-cfrg-vdaf/pull/136

cfrg / draft-irtf-cfrg-vdaf

Securely instantiating the PRG #106