zkvm: Remove unnecessary copies in miller loop and `sum_of_products`

This PR removes unnecessary copies in the following hot paths:

- `Fp::sum_of_products`: remove the copies of `a` and `b` by using `iter_mut()` and `iter()`, and write the result into `out` in place.
- Miller loop: both `doubling_step` and `addition_step` were rewritten to use the in-place operations and minimize the number of copies.
- Some other minor Miller loop copies were removed by using references in the `MillerLoopDriver` trait.
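
As a rough illustration of the `sum_of_products` change, the sketch below contrasts a by-value version with one that borrows its inputs and accumulates into `out`. It is not the code from this PR: plain `u64` values modulo a small prime stand in for `Fp`, and the names `MODULUS`, `sum_of_products_copying` and `sum_of_products_in_place` are hypothetical. Inputs are assumed to be already reduced below `MODULUS`.

```rust
// Toy modulus standing in for the BLS12-381 base field (assumption for illustration).
const MODULUS: u64 = 1_000_000_007;

/// By-value version: both arrays are copied into the function.
pub fn sum_of_products_copying<const T: usize>(a: [u64; T], b: [u64; T]) -> u64 {
    let mut acc = 0u64;
    for i in 0..T {
        acc = (acc + (a[i] * b[i]) % MODULUS) % MODULUS;
    }
    acc
}

/// Borrowing version: `a` is overwritten with the products via `iter_mut()`,
/// `b` is only read via `iter()`, and the sum is accumulated directly in `out`.
pub fn sum_of_products_in_place(out: &mut u64, a: &mut [u64], b: &[u64]) {
    *out = 0;
    for (ai, bi) in a.iter_mut().zip(b.iter()) {
        *ai = (*ai * *bi) % MODULUS;
        *out = (*out + *ai) % MODULUS;
    }
}
```

The trade-off in this sketch is that the borrowing version clobbers `a`, so it only fits call sites that no longer need the original values.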

This is achieved by adding zkvm-specific variants of the base operations that modify a value in place via `&mut self`, which avoids the copy made by the value-returning versions. Functions on the hot path then get a zkvm-specific variant that uses these operations wherever possible. Each variant is a new function, and `#[cfg]` selects between the zkvm no-copy version and the regular version. Keeping both makes it easier to compare the implementations and check that they are equivalent.
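
A minimal sketch of this pattern follows, under stated assumptions: `Fe` is a toy field element rather than the real `Fp`, the `_inp` method names and the `square_plus` function are hypothetical, and `target_os = "zkvm"` is assumed as the `#[cfg]` predicate. The actual predicate and names used in this PR may differ.

```rust
// Toy prime modulus (assumption for illustration only).
const P: u64 = 1_000_000_007;

/// Toy field element standing in for the real `Fp`.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Fe(pub u64);

impl Fe {
    pub fn new(v: u64) -> Fe {
        Fe(v % P)
    }
    /// Regular operations: return a fresh value.
    pub fn square(&self) -> Fe {
        Fe((self.0 * self.0) % P)
    }
    pub fn add(&self, rhs: &Fe) -> Fe {
        Fe((self.0 + rhs.0) % P)
    }
    /// In-place variants: mutate `self` through `&mut self`.
    pub fn square_inp(&mut self) {
        self.0 = (self.0 * self.0) % P;
    }
    pub fn add_inp(&mut self, rhs: &Fe) {
        self.0 = (self.0 + rhs.0) % P;
    }
}

/// Regular implementation of x^2 + y, built from value-returning operations.
pub fn square_plus_regular(x: &Fe, y: &Fe) -> Fe {
    x.square().add(y)
}

/// No-copy implementation of the same function, built from in-place operations.
pub fn square_plus_in_place(x: &mut Fe, y: &Fe) {
    x.square_inp();
    x.add_inp(y);
}

/// Selected when building for the zkvm target (assumed predicate).
#[cfg(target_os = "zkvm")]
pub fn square_plus(x: &mut Fe, y: &Fe) {
    square_plus_in_place(x, y)
}

/// Selected on every other target.
#[cfg(not(target_os = "zkvm"))]
pub fn square_plus(x: &mut Fe, y: &Fe) {
    *x = square_plus_regular(x, y);
}
```

Because both implementations stay in the source under separate names, a test can call each one directly and compare the results regardless of which one `#[cfg]` selects.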

This brings the aptos-lc ratcheting test down from 22289711 cycles to 15933247 cycles (a ~29% reduction).

The other hot paths that will be optimized in follow-up PRs are:

- `final_exponentiation` and the functions it calls: removing copies here is the next target. It currently takes ~6M cycles.
- `hash_to_curve`: the `G2::mul_by_x()` operation in particular could use G2 affine/double precompiles, which should save ~1M cycles.
