private-attribution / ipa

A raw implementation of Interoperable Private Attribution
MIT License

Further optimize `intermediates_to_table_indices` #1457

Open andyleiserson opened 4 days ago

andyleiserson commented 4 days ago

`intermediates_to_table_indices` works as follows:

It appears that `bits_to_table_indices` compiles to <200 instructions (fully unrolled, with no loops or branches), while the nibble rearrangement compiles to >1000 instructions (again fully unrolled, with no loops or branches). Implementing a single transpose-like operation that covers both steps would probably be more efficient.
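
To make "transpose-like" concrete, here is a minimal reference sketch of a bit transpose that produces one 4-bit index per bit position. The actual input and output layout of `intermediates_to_table_indices` is not reproduced in this issue, so everything here is an assumption for illustration: the function name `planes_to_nibbles`, the four `u64` bit-plane inputs, and the one-nibble-per-byte output are all hypothetical. It is written for clarity rather than speed; an optimized version would presumably use word-level shuffles instead of the per-bit loop.

```rust
/// Hypothetical illustration, not the code in the repository.
/// Treats `planes` as a 4 x 64 bit matrix (one u64 per row) and transposes it:
/// output element `i` is a 4-bit value whose bit `p` is bit `i` of `planes[p]`.
fn planes_to_nibbles(planes: [u64; 4]) -> [u8; 64] {
    let mut out = [0u8; 64];
    for (i, slot) in out.iter_mut().enumerate() {
        for (p, plane) in planes.iter().enumerate() {
            // Extract bit `i` of row `p` and place it at bit position `p`.
            let bit = ((*plane >> i) & 1) as u8;
            *slot |= bit << p;
        }
    }
    out
}
```

A single routine of this shape, specialized to the real layout, could serve as a correctness oracle when replacing the two separate steps with one fused operation.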