Closed by jan-ferdinand 1 month ago
Looks very good. Two observations:
1) There are some benchmark results that show surprising changes.
2) Instead of restricting the values generated for `environment_args` in `pseudorandom_initial_state`, would it make sense to rewrite the test containing the assert that there was no overflow (on a `u128` addition) to not use the `.test()` method, but let it set up its environment in a more manual way?
> There are some benchmark results that show surprising changes.
True. I guess that pertains mainly to `tasmlib_list_higher_order_u32_map_test_hash_xfield_element`. Could it be explained by different test input being generated? Since the input state generator has to accommodate multiple lists, I rewrote it, re-ordering the operations and arguments to the various `rng::gen()` calls.
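As a self-contained illustration of why this would change the benchmarks: with a fixed seed, a generator produces the same sequence of values, so re-ordering the draws swaps which *role* each value ends up in. The sketch below uses a tiny stand-in LCG, not the actual `rng::gen()` from the test suite, and the roles "list length" / "element seed" are only illustrative.

```rust
/// A tiny linear congruential generator standing in for `rng::gen()`.
/// Illustrative only; not the generator used by the test suite.
struct Lcg(u64);

impl Lcg {
    fn gen(&mut self) -> u64 {
        self.0 = self
            .0
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        self.0 >> 33
    }
}

/// Draw (list_length, element_seed) in the original order.
fn draw_original(seed: u64) -> (u64, u64) {
    let mut rng = Lcg(seed);
    let list_length = rng.gen();
    let element_seed = rng.gen();
    (list_length, element_seed)
}

/// Draw the same two values, but in the opposite order.
fn draw_reordered(seed: u64) -> (u64, u64) {
    let mut rng = Lcg(seed);
    let element_seed = rng.gen();
    let list_length = rng.gen();
    (list_length, element_seed)
}

fn main() {
    let (len_a, seed_a) = draw_original(42);
    let (len_b, seed_b) = draw_reordered(42);
    // The generator emits the identical sequence in both runs, but the
    // values are assigned to different roles, so the generated initial
    // state -- and hence the benchmark input -- differs.
    assert_eq!(len_a, seed_b);
    assert_eq!(seed_a, len_b);
}
```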
> [W]ould it make sense to rewrite the test [to] not use the `.test()` method but let it set up its environment in a more manual way?
Potentially, yes. The one thing I tried was accessing `ShadowedFunction::test_initial_state()`, but that's a private method (and I'm fine with that). What would be a good way to set up a test in this way?
Please note that I rebased this branch against `master` in order to be able to call the `test_rust_equivalence_given_execution_state` test helper function, so that I could rewrite the test case that forced you to do shady things in the `pseudorandom_initial_state` trait function.
Apart from your suggestions, I also made one more addition that might be worth reviewing: the type `ChainMap` now declares an associated constant `NUM_INTERNAL_REGISTERS`, through which any mapping function requiring runtime parameters from deeper on the stack can access the required offset programmatically. The intention is to allow changing the `chain_map` internals; since `Map` is `ChainMap::<1>`, we can now do so without having to touch all snippets that use `Map`. Let me know if you have any suggestions to improve this design.
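A hypothetical sketch of what such a design could look like. `ChainMap`, `NUM_INTERNAL_REGISTERS`, and `Map` are the names mentioned in this PR, but the body below does not claim to match the real implementation; the value `3 + NUM_INPUT_LISTS` is a placeholder chosen to be consistent with the stated ABI (runtime parameters at distance `3 + num_input_lists`).

```rust
// Hypothetical sketch; not the actual tasm-lib definition.
struct ChainMap<const NUM_INPUT_LISTS: usize>;

impl<const NUM_INPUT_LISTS: usize> ChainMap<NUM_INPUT_LISTS> {
    /// Number of stack registers the chain-map loop keeps for itself.
    /// Placeholder value, matching the ABI described in this PR.
    const NUM_INTERNAL_REGISTERS: usize = 3 + NUM_INPUT_LISTS;
}

/// The "traditional" map is a chain map over a single input list.
type Map = ChainMap<1>;

fn main() {
    // A mapping function needing runtime parameters from deeper on the
    // stack reads them at this offset, however the internals evolve.
    assert_eq!(Map::NUM_INTERNAL_REGISTERS, 4);
    assert_eq!(ChainMap::<3>::NUM_INTERNAL_REGISTERS, 6);
}
```

The benefit of the associated constant is exactly the decoupling described above: snippets compute their offsets from `NUM_INTERNAL_REGISTERS` instead of hard-coding a number, so a change to the chain-map internals only touches the constant.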
> some benchmark results that show surprising changes
Is the change to the initial state generator a likely explanation? If so, are we fine with the benchmark changing more dramatically than the changes to the assembly alone would suggest?
Yes, we are fine with that. But with the suggested change of using the `bench` input parameter to set the list length (a suggestion you already implemented), we can trust the benchmarks of `Map` going forward. That's enough for me.
Other benchmarks reveal the performance changes to `Map` resulting from this PR, and those changes look positive (positive in the value sense, i.e., a decrease in the row counts).
Previously, the higher-order assembly function `map` took all elements of an input list, mapped them according to some supplied function, and generated an output list. Equivalently, written in Rust, this was `list.into_iter().map(f).collect_vec()`.

Now, `map` is generalized over the number of input lists, enabling chaining of multiple input lists while still producing only one output list. As Rust: `list_0.into_iter().chain(list_1).map(f).collect_vec()`. Between 0 and 15 (both inclusive) input lists are supported.
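Spelled out as compilable Rust (using std's `collect` in place of itertools' `collect_vec`), the two one-liners and their relationship — a chain map over a single list, with the second list empty, degenerates to the traditional map:

```rust
/// The old behavior: map one input list to one output list.
fn map_one(list: Vec<u32>, f: impl Fn(u32) -> u32) -> Vec<u32> {
    list.into_iter().map(f).collect()
}

/// The generalized behavior, here for two input lists: chain them,
/// then map, still producing a single output list.
fn chain_map_two(list_0: Vec<u32>, list_1: Vec<u32>, f: impl Fn(u32) -> u32) -> Vec<u32> {
    list_0.into_iter().chain(list_1).map(f).collect()
}

fn main() {
    let square = |x: u32| x * x;
    assert_eq!(map_one(vec![1, 2, 3], square), vec![1, 4, 9]);
    // Chaining an empty second list is equivalent to the traditional map.
    assert_eq!(chain_map_two(vec![1, 2, 3], vec![], square), vec![1, 4, 9]);
    // Elements of the second list follow those of the first.
    assert_eq!(chain_map_two(vec![1, 2], vec![3], square), vec![1, 4, 9]);
}
```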
Note: the ABI for `map` has changed. Before, an inner function could access runtime parameters on the stack at a distance of 3. Now, this distance has increased to `3 + num_input_lists`. For a "traditional" map, this is now 4.