Closed: alexkeizer closed this pull request 1 month ago.
@alexkeizer Looks good, but I'd like to see a perf comparison to know the cost of figuring out the next PC this way. Could you run that experiment, please?
I ran `#time` on `gcm_gmult_v8`, and it took basically the same time as before (450ms). This checks out: `reflectBitVecLiteral` will only do a reduction if the value is not obviously a constant, but our `stepi` generation will have reduced the value to a constant wherever possible (i.e., for everything besides conditional branches).

So the optimization I suggested would not make much of a difference: all the reflections we could skip by keeping that list are exactly the calls to `reflectBitVecLiteral` that are already cheap, basically free.
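For concreteness, here is a minimal sketch of that check-before-reduce pattern, with illustrative names and an assumed shape for the reflection helper (not the actual LNSym code):

```lean
import Lean

open Lean Meta

/-- Sketch: read a `BitVec.ofNat _ n` literal off syntactically when possible,
    and only pay for a `whnfD` reduction when that cheap check fails. -/
def getBitVecLit? (e : Expr) : MetaM (Option Nat) := do
  -- Cheap path: the value is already a literal (the common, non-branching case).
  if let some n := ofNatLit? e then
    return some n
  -- Expensive path: reduce to weak-head normal form, then retry the match.
  let e' ← whnfD e
  return ofNatLit? e'
where
  -- Match `BitVec.ofNat w n` where `n` is a raw natural-number literal.
  ofNatLit? (e : Expr) : Option Nat :=
    if e.isAppOfArity ``BitVec.ofNat 2 then
      e.appArg!.rawNatLit?
    else
      none
```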
@alexkeizer Good to know, and thanks for running the numbers.
Description:

We now attempt to reflect the PC after each step, only falling back to blindly incrementing it if reflection fails. With this change, `sym_n` is able to deal with unconditional branches, and I've fixed the `AddLoop` example accordingly.

The `reflectBitVecValue` used here does involve a `whnfD`, which might be expensive. However, this reduction only happens when the value of the PC is not a literal (i.e., for conditional branches), where reflection of the PC really is necessary. On non-branching instructions, or even unconditional branches, the call should be basically free, so there is little downside to this approach.
Testing:

`make all` ran locally.

License:
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.