Performance idea: defer modulo reduction

Some of the papers out there that deal with fields note that you can safely defer modulus reduction in some cases, which can improve performance a little. We're a little way from needing that sort of optimization, but here is a basic idea.

When you multiply or add two fields, we might produce a different type. That bit would include overflowed bits and the type would encode how many bits it needs.

For instance, if we add two Fp31 values (not the 31-bit Mersenne, but the 5-bit Mersenne), addition would produce a type that is backed with a u8 and has an understanding that there are "6 bits" present. On multiplication, we'd have a u16 backing and an understanding that there are 10 bits present. These types can chain, so that ab + cd + e from Fp31 inputs would produce a u16 that believes that it has 12 bits.

These types would not be usable, but could be converted back to the base field. This would occur automatically for some operations (like inversion) and when sending or revealing shares. We would also probably want to convert back for operations that did not have a destination type like in cases where we continuously accumulate (for such cases, we'd probably need an enum type to store an accumulated value). This would allow code to look the same, but values could accumulate without reduction.

This is obviously a ton of work that adds a bunch of type complexity to the code and so it should not be attempted until we get much better performance in other areas of the code.

private-attribution / ipa

Performance idea: defer modulo reduction #937