cfrg / draft-irtf-cfrg-vdaf

VDAF specification
Other
17 stars 14 forks source link

Prio3Count circuit optimization #365

Closed divergentdave closed 5 days ago

divergentdave commented 1 week ago

The Count circuit currently computes $x^2-x$ using one Mul gate, and some affine gates. This could be slightly improved by instead using a polynomial evaluation gate, with arity 1 instead of 2, to compute either $x^2$ or the whole polynomial. This would save a little bit of space on the wire, and probably some computation time as well. We should benchmark the potential improvement, and then decide whether it is worth breaking wire compatibility of Prio3Count or not.

divergentdave commented 1 week ago

See divviup/libprio-rs#1090 for preliminary benchmarking results. I didn't see any CPU performance thus far, though there may still be some room for optimizing the implementation. In terms of message sizes, we would save 8 bytes on the leader input share and 8 bytes on the prepare shares.

cjpatton commented 1 week ago

Thanks for trying this out! Given that you had to roll a special gadget for this to get competitive CPU time, I would say the implementation complexity outweights the bandwidth gains. What's 8 bytes between friends?