Open LeventErkok opened 1 year ago
Thanks for the pointer! I was vaguely aware of lambda
in the context of Yices, but I wasn't aware that Z3 supported it as well—neat!
I'm glad that we wrote up that post, as it has revealed that we are likely still leaving performance on the table. There are two other alternative memory models that we should try:
lambda
term, as your propose above.arrayFromFn
, for example. (Despite the name, arrayFromFn
doesn't actually use an SMT array under the hood.)Both of these options would be appealing because they could be eagerly constructed before the start of simulation, which would avoid the extra bookkeeping of the lazy macaw-symbolic
memory model. We should actually benchmark these to see if they are indeed faster, of course.
SBV used to have what I called "functional" arrays. Which were simply (psuedo-code):
type A a b = a -> b
read f a = f a
write f a v a' = ite (a' .== a) v (f a')
Note how this keeps everything on the Haskell side. But they had terrible performance compared to SMTLib arrays: Once you did a small number of write
's, it built up this huge chain of ite
calls, which caused a ton of split points. I tried to optimize it by keeping a cache on the Haskell side to speed-up concrete address read/writes, but it got too complicated and wasn't worth it. I removed it all and now exclusively use SMTLib arrays as they have the best performance, and lambdaAsArray
gives the flexibility of using a function as is.
I suspect What4's arrayFromFn
might be working in a similar way. I'd be surprised if it performed well. But maybe you have some other tricks to make it go fast.
This is useful data to have, thanks! Indeed, I suspect that arrayFromFn
would likely not be as fast overall due to the sheer number of ite
s involved, but it's hard to say without actually trying it. (My intuition about SMT solver performance rarely turns out to be correct.)
This is not an issue, but rather a comment on https://galois.com/blog/2023/03/making-a-scalable-smt-based-machine-code-memory-model/
(I'd have put a comment on the blog, but that doesn't seem possible. In any case, GitHub is probably better suited for this discussion anyhow.)
Cool blog! It reminded me about some of the more recent advances in z3 on how large symbolic memories can be modeled. As you mention, even though the memory is excessively large, any given program only accesses a small part of it, at least for most programs of interest. To address these concerns, z3 now allows functions (expressed as lambdas) to be used as arrays. This makes a lot of sense: A memory is nothing but a mapping from addresses to values, which is exactly what a function is. With some internal solver magic, a lot of the clunky SMT-array updates can be avoided.
I've also recently added support for this in SBV, with the function
lambdaAsArray
. The idea is that you create a lambda-term (which is a z3 extension, I don't think other solvers support it quite yet), encoding the memory as a function.Here's an example I coded up that should illustrate the point:
(Note that you need sbv 10.1 to have this example run, which just got released on Hackage a few days ago.)
I'll let you look into the details, but the crucial point is we get to initialize the memory to not just the typical "all 0", but with a function that stores an arbitrary value in any address, which can be a function of that address. (See the definition of
initMem
.) If you were doing this with a regular array, you'd have an infinite conjunction, or you'd have to use a quantifier; neither of which play all that well. But with a lambda-term, it behaves quite nicely. You can runexample1
to see what a small foot-print it has as it communicates to the solver; the trick is revealed in the SMT-Lib line:which shows the
lambda
extension allowed by z3.In
example2
, we do load a program into this memory, and do some queries on it. Again, the footprint is quite small. You can see it by running it in the verbose mode. (runSMTWith z3{verbose=True} ...
)Anyhow. I wanted to share this cool z3 feature with you. It'd be interesting to see if you can incorporate this idea to Macaw and see how it fares against your solution. Of course, when
lambda
terms are around, z3 no longer guarantees decidability: You can getunknown
as an answer (or worse, loop forever); but I think if you use it in this way, it should get you far. (They also allow lambdas with sequence-folding etc., which can land you into the land of undecidability.) I'd be interested in hearing what you find out, if you play around with this.Great to see Galois is pushing the frontiers on this sort of analysis..