Replace `aggregate()` with a streaming API

The current aggregation API takes a sequence of output shares and returns an aggregate share. This envisions VDAFs for which the aggregate result depends on the order in which output shares are aggregated, but so far this isn't true of anything we've come up with. In fact, DAP (implicitly?) requires a VDAF for which output shares can be aggregated in any order. (The Leader and Helper may not aggregate output shares in the same order.)

One problem with the current API is that it doesn't align well with practice. Normally what we expect Aggregators to do is aggregate in a "streaming" fashion, meaning output shares are not stored individually, but accumulated into an existing aggregate share as soon as they're ready to be aggregated.

In fact, we'd like to make this "streaming aggregation" normative in DAP: https://github.com/ietf-wg-ppm/draft-ietf-ppm-dap/issues/581.

However in order to do this, we'll need to modify the VDAF syntax. There's a few things we could do, but probably the simplest is to split vdaf.aggregate() into the following functions:

vdaf.agg_init() -> AggShare: returns an empty aggregate share.
vdaf.agg_update(agg_share: AggShare, out_share) -> AggShare: updates an aggregate share with the incoming output share.

I support this change and the corresponding changes in DAP (https://github.com/ietf-wg-ppm/draft-ietf-ppm-dap/issues/581). However I am concerned about the potential for type confusion here. In Prio3, it so happens that output shares, aggregate shares and aggregate results have the same "shape" and thus can be summed together naively to yield semantically meaningful and useful results. But we shouldn't bake that assumption into the abstract VDAF interface, because I don't think it will hold for all VDAFs. Indeed, for Poplar1, aggregate shares and aggregate results already have a different type (FieldVec and list[int], respectively 1).

So I propose adding a little bit more to the VDAF interface to accommodate VDAFs that don't work like Prio3 and to steer implementations away from type confusion.

vdaf.aggregate_init(agg_param: AggParam) -> IntermediateAggShare returns an empty intermediate aggregate share. IntermediateAggShare is an associated type on the VDAF distinct from OutShare or AggShare.
vdaf.accumulate(agg_param: AggParam, intermediate: IntermediateAggShare, out_share: OutShare) -> IntermediateAggShare accumulates an output share into an intermediate aggregate share.
vdaf.merge(agg_param: AggParam, intermediates: list[IntermediateAggShare]) -> AggShare merges one or more intermediate aggregate shares into an aggregate share. Aggregate shares may then be unsharded into an aggregate result using the existing vdaf.unshard function.

Some discussion:

I believe all these associated functions need to take the aggregation parameter, as vdaf.aggregate and vdaf.unshard already do.
An intermediate aggregate share is what DAP implementations would construct for each "batch bucket", as discussed in https://github.com/ietf-wg-ppm/draft-ietf-ppm-dap/issues/581.
I use the verb "accumulate" instead of "update". I think it's more specific and descriptive, and matches the existing API in libprio.
The verb "merge" also matches the existing libprio API.
We could do away with vdaf.aggregate if we do the rest of this stuff, but we could also keep it for use cases besides streaming DAP.

Finally, I don't think the "editorial" label is appropriate here: we're discussing some changes to VDAF that would result in changes to implementations.

I'm partly with you, but I think your proposal is more complicated than we need.

In Prio3, it so happens that output shares, aggregate shares and aggregate results have the same "shape" and thus can be summed together naively to yield semantically meaningful and useful results. But we shouldn't bake that assumption into the abstract VDAF interface, because I don't think it will hold for all VDAFs. Indeed, for Poplar1, aggregate shares and aggregate results already have a different type (FieldVec and list[int], respectively 1).

For Prio3 and Poplar1, the type of the output share is the same as the type of the aggregate share. It's true that the aggregate result is different for both, but I don't see how this is relevant to this case.

So I propose adding a little bit more to the VDAF interface to accommodate VDAFs that don't work like Prio3 and to steer implementations away from type confusion.

* `vdaf.aggregate_init(agg_param: AggParam) -> IntermediateAggShare` returns an empty intermediate aggregate share. `IntermediateAggShare` is an associated type on the VDAF distinct from `OutShare` or `AggShare`.

* `vdaf.accumulate(agg_param: AggParam, intermediate: IntermediateAggShare, out_share: OutShare) -> IntermediateAggShare` accumulates an output share into an intermediate aggregate share.

* `vdaf.merge(agg_param: AggParam, intermediates: list[IntermediateAggShare]) -> AggShare` merges one or more intermediate aggregate shares into an aggregate share. Aggregate shares may then be unsharded into an aggregate result using the existing `vdaf.unshard` function.

Why do we need to distinguish between IntermedaiteAggShare and AggShare? Why not just do merge(agg_share: AggShare, agg_shares: list[AggShare]) -> AggShare? I'm wary of adding yet another intermediate type to abstract around what amounts to adding vectors in any VDAF we've thought of so far.

* I use the verb "accumulate" instead of "update". I think it's more specific and descriptive, and matches the [existing API in `libprio`](https://docs.rs/prio/latest/prio/vdaf/trait.Aggregatable.html#tymethod.accumulate).

* The verb "merge" also matches the [existing `libprio` API](https://docs.rs/prio/latest/prio/vdaf/trait.Aggregatable.html#tymethod.merge).

Works for me.

Finally, I don't think the "editorial" label is appropriate here: we're discussing some changes to VDAF that would result in changes to implementations.

I removed the label, however note that the change is not wire breaking and I don't believe it's mandatory for an implementation to match the API.

I can live with:

vdaf.aggregate_init(agg_param: AggParam) -> AggShare
vdaf.accumulate(agg_param: AggParam, share: AggShare, out_share: OutShare) -> AggShare
vdaf.merge(agg_param: AggParam, shares: list[AggShare]) -> AggShare

But the risk is that there may be some future VDAF that won't fit this shape. I'm comfortable accepting that risk.

cfrg / draft-irtf-cfrg-vdaf

Replace `aggregate()` with a streaming API #432