cfrg / draft-irtf-cfrg-vdaf

VDAF specification
Other
18 stars 14 forks source link

Replace `aggregate()` with a streaming API #432

Open cjpatton opened 2 days ago

cjpatton commented 2 days ago

The current aggregation API takes a sequence of output shares and returns an aggregate share. This envisions VDAFs for which the aggregate result depends on the order in which output shares are aggregated, but so far this isn't true of anything we've come up with. In fact, DAP (implicitly?) requires a VDAF for which output shares can be aggregated in any order. (The Leader and Helper may not aggregate output shares in the same order.)

One problem with the current API is that it doesn't align well with practice. Normally what we expect Aggregators to do is aggregate in a "streaming" fashion, meaning output shares are not stored individually, but accumulated into an existing aggregate share as soon as they're ready to be aggregated.

In fact, we'd like to make this "streaming aggregation" normative in DAP: https://github.com/ietf-wg-ppm/draft-ietf-ppm-dap/issues/581.

However in order to do this, we'll need to modify the VDAF syntax. There's a few things we could do, but probably the simplest is to split vdaf.aggregate() into the following functions:

tgeoghegan commented 2 days ago

I support this change and the corresponding changes in DAP (https://github.com/ietf-wg-ppm/draft-ietf-ppm-dap/issues/581). However I am concerned about the potential for type confusion here. In Prio3, it so happens that output shares, aggregate shares and aggregate results have the same "shape" and thus can be summed together naively to yield semantically meaningful and useful results. But we shouldn't bake that assumption into the abstract VDAF interface, because I don't think it will hold for all VDAFs. Indeed, for Poplar1, aggregate shares and aggregate results already have a different type (FieldVec and list[int], respectively 1).

So I propose adding a little bit more to the VDAF interface to accommodate VDAFs that don't work like Prio3 and to steer implementations away from type confusion.

Some discussion:

Finally, I don't think the "editorial" label is appropriate here: we're discussing some changes to VDAF that would result in changes to implementations.

cjpatton commented 2 days ago

I'm partly with you, but I think your proposal is more complicated than we need.

In Prio3, it so happens that output shares, aggregate shares and aggregate results have the same "shape" and thus can be summed together naively to yield semantically meaningful and useful results. But we shouldn't bake that assumption into the abstract VDAF interface, because I don't think it will hold for all VDAFs. Indeed, for Poplar1, aggregate shares and aggregate results already have a different type (FieldVec and list[int], respectively 1).

For Prio3 and Poplar1, the type of the output share is the same as the type of the aggregate share. It's true that the aggregate result is different for both, but I don't see how this is relevant to this case.

So I propose adding a little bit more to the VDAF interface to accommodate VDAFs that don't work like Prio3 and to steer implementations away from type confusion.

* `vdaf.aggregate_init(agg_param: AggParam) -> IntermediateAggShare` returns an empty intermediate aggregate share. `IntermediateAggShare` is an associated type on the VDAF distinct from `OutShare` or `AggShare`.

* `vdaf.accumulate(agg_param: AggParam, intermediate: IntermediateAggShare, out_share: OutShare) -> IntermediateAggShare` accumulates an output share into an intermediate aggregate share.

* `vdaf.merge(agg_param: AggParam, intermediates: list[IntermediateAggShare]) -> AggShare` merges one or more intermediate aggregate shares into an aggregate share. Aggregate shares may then be unsharded into an aggregate result using the existing `vdaf.unshard` function.

Why do we need to distinguish between IntermedaiteAggShare and AggShare? Why not just do merge(agg_share: AggShare, agg_shares: list[AggShare]) -> AggShare? I'm wary of adding yet another intermediate type to abstract around what amounts to adding vectors in any VDAF we've thought of so far.

* I use the verb "accumulate" instead of "update". I think it's more specific and descriptive, and matches the [existing API in `libprio`](https://docs.rs/prio/latest/prio/vdaf/trait.Aggregatable.html#tymethod.accumulate).

* The verb "merge" also matches the [existing `libprio` API](https://docs.rs/prio/latest/prio/vdaf/trait.Aggregatable.html#tymethod.merge).

Works for me.

Finally, I don't think the "editorial" label is appropriate here: we're discussing some changes to VDAF that would result in changes to implementations.

I removed the label, however note that the change is not wire breaking and I don't believe it's mandatory for an implementation to match the API.

tgeoghegan commented 1 day ago

I can live with:

But the risk is that there may be some future VDAF that won't fit this shape. I'm comfortable accepting that risk.