Remove iterators and boxing from `recursively_compute_final_check`

private-attribution / ipa

A raw implementation of Interoperable Private Attribution

MIT License

39 stars 23 forks source link

Remove iterators and boxing from `recursively_compute_final_check` #1173

Closed akoshelev closed 3 weeks ago

akoshelev commented 3 weeks ago

As we discussed in #1166, the way iterators are boxed to achieve the desired result of recursively exhaust them and build the final result made code look more complex that otherwise it should be.

This is a proposal to simplify this implementation by:

re-using the u and v allocations for compression algorithm
using a more straightforward chunks method that is available on vectors
removing custom arraychunks implementation
Simplifying trait bounds for functions in this module

codecov[bot] commented 3 weeks ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 91.92%. Comparing base (86d0178) to head (52b3d6e).

Additional details and impacted files

```diff @@ Coverage Diff @@ ## main #1173 +/- ## ========================================== - Coverage 91.94% 91.92% -0.03% ========================================== Files 195 194 -1 Lines 28929 28900 -29 ========================================== - Hits 26600 26566 -34 - Misses 2329 2334 +5 ```

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

akoshelev commented 3 weeks ago

For the first iteration, we probably do not want to collect the u or v in a vector before compressing them since that would take around 670 Megabytes in memory for 50 million bit multiplications. The current changes would make it necessary to collect the initial iterator over u,v values in a vector.

This change does not require collecting iterators into vectors, it simply re-uses the existing allocation for u and v values

akoshelev commented 3 weeks ago

I synced with @danielmasny and he pointed me to the new PR that actually changes how things are passed into this function. It won't be a vec anymore, but an iterator that yields field values for all bits stored in a batch. So it is definitely suboptimal to collect all 50M field values into a vec.

I don't think this PR makes sense