patcg-individual-drafts / ipa

Interoperable Private Attribution (IPA) - A Private Measurement Proposal

Client-server bandwidth optimization #81

Open martinthomson opened 9 months ago

martinthomson commented 9 months ago

tl;dr: it might be possible to reduce client-server bandwidth to one sixth of what it is today, at the cost of an increase in server-server bandwidth of the same magnitude

Our current plan for managing bandwidth usage has report collectors submit input events to each helper separately. After a query is created (which happens on one helper), the events destined for each helper are sent to that helper directly. However, there is a model in which the client-server bandwidth can be greatly reduced, inspired by a trick that the libprio people have implemented.

In the direct-to-helper mode, each helper receives events that comprise an encrypted blob (containing two shares of a match key and some supplementary data), metadata (like site, epoch, and so forth), and shares of the breakdown key, trigger value, and timestamp. Shares are replicated, so each helper holds two of the three shares of each value.
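For orientation, here is a rough sketch of what such a per-event report might contain; the field names and types are purely illustrative and are not the actual IPA wire format.

```rust
/// Illustrative (not normative) layout of what one helper receives per event
/// in the direct-to-helper mode. Field names and types are hypothetical.
#[allow(dead_code)]
struct EventReport {
    /// Encrypted blob carrying this helper's two replicated match key shares
    /// plus supplementary data.
    encrypted_match_key_blob: Vec<u8>,
    /// Cleartext metadata, e.g. site and epoch.
    site: String,
    epoch: u16,
    /// Replicated secret shares: each helper holds two of the three shares.
    breakdown_key: (u64, u64),
    trigger_value: (u64, u64),
    timestamp: (u64, u64),
}
```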

Even in a mode where messages go directly to each helper, we could greatly reduce client-server bandwidth by replacing two of the three shares with shared randomness. This potentially applies both to match keys and to the additional shares, but the greatest benefit comes from compressing the additional shares.

The basic idea is that two of the three shares are each entirely replaced by a seed for a CSPRNG or PRF. We then define a way to use that RNG to generate values for those shares. For the breakdown/trigger/timestamp/... shares, these all come from the same node, so a single seed is sufficient to supply values for all of them. The final share is set to $x_2 = x \oplus f(s_0, j) \oplus f(s_1, j)$ (where $f()$ is the PRF, $s_i$ is the seed that replaces the share $x_i$, and $j$ is an identifier for this particular share). This final share has to be transmitted in full.
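As a minimal sketch of that construction for a single 64-bit value under XOR sharing: `f` below is a dependency-free stand-in for a real PRF/CSPRNG (a real implementation would use something like AES or a hash-based PRF), and all names are illustrative rather than taken from the IPA codebase.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Stand-in for f(s, j): expands a seed s and a per-share index j into
/// pseudorandom bits. NOT a real PRF; used only to keep the sketch
/// dependency-free.
fn f(seed: u64, j: u64) -> u64 {
    let mut h = DefaultHasher::new();
    (seed, j).hash(&mut h);
    h.finish()
}

/// Report-collector side: shares x0 and x1 are implied by the seeds s0 and s1,
/// so only x2 = x ^ f(s0, j) ^ f(s1, j) needs to be transmitted in full.
fn final_share(x: u64, s0: u64, s1: u64, j: u64) -> u64 {
    x ^ f(s0, j) ^ f(s1, j)
}

/// Reconstruction (for illustration only): combining all three shares
/// recovers x. No single helper ever holds all three.
fn reconstruct(x2: u64, s0: u64, s1: u64, j: u64) -> u64 {
    f(s0, j) ^ f(s1, j) ^ x2
}

fn main() {
    let (s0, s1) = (0x1111_u64, 0x2222_u64); // seeds replacing shares x0 and x1
    let x = 42_u64; // e.g. a breakdown/trigger/timestamp value
    let j = 7_u64; // identifier for this particular share
    let x2 = final_share(x, s0, s1, j);
    assert_eq!(reconstruct(x2, s0, s1, j), x);
    println!("transmitted share x2 = {x2:#018x}");
}
```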

The drawback is that we're using replicated secret sharing, so we're still sending two copies of the full share (one to each of the two helpers that hold it). But if we are willing to have helpers manage distribution of that share, we can encrypt the stream of full shares under a key that we make available to the two helpers that will receive that share, and send it only once. The same trick applies to the shared seeds.
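A very small sketch of that hand-off, with hypothetical names: the client uploads the encrypted per-record data to one helper only, and that helper forwards the same ciphertext to the other helper that holds the decryption key.

```rust
/// Hypothetical helper identity; none of this mirrors the real IPA APIs.
struct Helper {
    id: u8,
}

impl Helper {
    /// Stand-in for a server-to-server transfer of per-record data.
    fn forward(&self, peer: &Helper, ciphertext: &[u8]) {
        println!(
            "helper {} -> helper {}: forwarding {} bytes of per-record data",
            self.id,
            peer.id,
            ciphertext.len()
        );
    }
}

fn main() {
    let (h1, h2) = (Helper { id: 1 }, Helper { id: 2 });

    // The client sends this encrypted stream of full shares to h1 only;
    // both h1 and h2 hold the key needed to decrypt it.
    let encrypted_full_shares = vec![0u8; 1024];

    // h1 forwards a copy to h2, so the client never sends it twice.
    h1.forward(&h2, &encrypted_full_shares);
}
```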

That means that, for the shares they submit, report collectors could generate:

* two PRF seeds (a small, fixed overhead), each encrypted toward the two helpers that hold the share it replaces; and
* a single copy of the final, full share of each value (the per-record data), encrypted under a key made available to the two helpers that receive that share.

Relative to what we currently do, this results in a small amount of added fixed overhead, but per-record client-server traffic drops to one sixth of what we send today. The cost is that the helper that receives this data needs to send a copy of the large per-record data to another helper node. If we value client-server bandwidth more highly than server-server bandwidth, this is likely a very good trade-off.
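A back-of-the-envelope for the one-sixth figure, assuming fully replicated sharing as described above (each of the three helpers currently receives two of the three shares of each value); the sizes here are purely illustrative.

```rust
fn main() {
    let share_bytes: u64 = 8; // assumed size of one share of a reported value
    let records: u64 = 1_000_000;

    // Today: each of the 3 helpers receives 2 replicated shares per record,
    // so the client transmits 6 share-copies per value.
    let client_now = records * 3 * 2 * share_bytes;

    // Proposed: the client uploads only the one full share per record
    // (plus a fixed number of seeds, ignored here)...
    let client_proposed = records * share_bytes;

    // ...and one helper forwards a copy of that data to its peer.
    let added_server_to_server = records * share_bytes;

    println!("client-server today:    {client_now} bytes");
    println!("client-server proposed: {client_proposed} bytes (one sixth)");
    println!("added server-server:    {added_server_to_server} bytes");
}
```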

This trick could also be used for match keys, but the savings are small enough that it is not worthwhile. The CSPRNG seed will be roughly the same size as the large match key we are considering for use in the OPRF (32 bytes). That's larger than the 40-bit value we are currently using, even with the doubling. That means the only potential saving comes if we are using the larger OPRF match key, because we would avoid the cost of sending the two shares to each helper; but half of that saving is cancelled by the need to encrypt the seeds separately, which costs an extra 16 bytes.