cambrian / accumulator

Cryptographic accumulators in Rust.
https://cambrian.github.io/accumulator
MIT License
133 stars 34 forks source link

Help me organize the zero allocation U256/U512 code #26

Closed eddiew closed 5 years ago

eddiew commented 5 years ago

because it's currently an 800-line blob of a file. See u256.rs in the zero branch.

A good thing to do would be to deduplicate the U256/U512 shared logic with a macro. The mul instance for U256 can probably be changed to return a U256, and a full_mul function could be added to replace what I currently have.

If you're curious, using the zero-allocation code speeds up the primality checkers by anywhere between 10 and 25%. The runtime can in theory be reduced by another few pct by eliminating intermediate conversions between U256s and U512s, after which I would call it Done™. Updating the multiplication functions should facilitate this.

lucasege commented 5 years ago

Just committed a quick attempt to just throw all common functions into a macro. This cuts file size down significantly, but there are still a couple functions with minor differences between 256/512. These minor differences could probably be addressed by passing more arguments to the macro, but that seems less clear.

For example, both structs define the function 'u256' and 'u512', which are identical in implementation, but the minor name difference would require a third parameter to define the fn name (seems gross), or some dynamic name generation by manipulating strings within the macro (seems unclear).

I haven't gotten to looking into the intermediate conversion optimizations, but will hopefully find time for that later today. Reviews welcome, feel free to change names/move things around as you please.

lucasege commented 5 years ago

RE: "TODO" on FROM<U256>, tried using transmute instead of copy_from_slice, but transmute is about 2x slower (6.9 ns with copy_from_slice, vs. 12.77 ns) for the hash of "data" in U256. copy_from_slice implements memcpy under the hood, so might be good to keep.

eddiew commented 5 years ago

Thanks. Closing this issue now.

Re the memcpy thing, I was wondering if copying the size field separately from the data would be slower than doing it all together. Looks like it is. Maybe transmute is preventing some optimizations from taking place? In either case copying an extra field is not that much slower.