Open RyanGlScott opened 3 weeks ago
For reference, here is the definition of HasLLVMAnn
:
The parameter ?recordLLVMAnnotation
is usually used in the following way:
Pred sym
, representing a safety condition, is annotated with What4.Interface.annotateTerm
BadBehavior
is constructed that represents the meaning of the safety condition?recordLLVMAnnotation
is called with these arguments, it generally stashes the (annotation, BadBehavior
) pair into a MapF
that lives in an IORef
, where it can later be retrieved for analysisBadBehavior
from the map and analyze it.I would actually suggest we consider supporting an API in Macaw that diverges slightly from this scheme, to potentially support more different use-cases. I have a particular proposal below, but just generally think we should consider alternatives.
We could create an enumeration (perhaps MacawError
) of all (or at first, most) of the assertions that Macaw makes, and then provide an interface like so:
data MacawError sym where
divByZero :: forall w. (1 <= w) => SymExpr (BVType w) -> MacawError sym
-- more constructors here ...
-- | Given a safety predicate and a description of the error it represents,
-- return a new predicate (and possibly perform additional side-effects, such as
-- recording information about the predicate).
type MacawProcessAssertion sym
= (?processAssert :: sym -> Pred sym -> MacawError sym -> IO (Pred sym))
With this scheme, the MacawProcessAssertion sym
could still support the existing use-case of recording information about Pred sym
s to be used in later analysis, but it has more power:
It has access to the Pred sym
itself, not just an annotation on it, so
Pred
is a constant before deciding what to do nextIt can change the predicate, e.g., to skip making certain classes of assertions by returning truePred sym
... and potentially other use-cases not yet envisioned here. A default implementation might be:
defaultMacawProcessAssertion _sym p e = pure p
Or, to support the existing use-case (not 100% sure this type-checks):
defaultMacawProcessAssertion mapRef sym p e = do
(n, p') <- annotateTerm sym p
modifyIORef mapRef (MapF.insert n e)
pure p'
macaw-symbolic
is built on top of thecrucible-llvm
memory model, which means that it has the ability to report instances of C-oriented bad behavior (assuming that the underlying machine code adheres to C's memory model conventions). These instances of bad behavior are tracked via theHasLLVMAnn
constraint that is threaded throughoutmacaw-symbolic
.In addition to C memory model checks,
macaw-symbolic
also adds a variety of additional assertions that are specific to machine code. These include (but are likely not limited to):Checking if reads from and writes to the global address space are within bounds:
https://github.com/GaloisInc/macaw/blob/186c35da7c35e303ea3e06e026cdf8f0b375ecb5/symbolic/src/Data/Macaw/Symbolic/Memory/Common.hs#L172-L175 https://github.com/GaloisInc/macaw/blob/186c35da7c35e303ea3e06e026cdf8f0b375ecb5/symbolic/src/Data/Macaw/Symbolic/Memory/Common.hs#L181-L187
Using a default value when a conditional read's condition does not hold:
https://github.com/GaloisInc/macaw/blob/186c35da7c35e303ea3e06e026cdf8f0b375ecb5/symbolic/src/Data/Macaw/Symbolic/MemOps.hs#L1095-L1099
A variety of checks when performing memory operations involving pointers:
https://github.com/GaloisInc/macaw/blob/186c35da7c35e303ea3e06e026cdf8f0b375ecb5/symbolic/src/Data/Macaw/Symbolic/MemOps.hs#L502-L506
The
attn
signal (inmacaw-ppc-symbolic
):https://github.com/GaloisInc/macaw/blob/186c35da7c35e303ea3e06e026cdf8f0b375ecb5/macaw-ppc-symbolic/src/Data/Macaw/PPC/Symbolic/Functions.hs#L82-L84
Loss of floating-point precision (in
macaw-ppc-symbolic
):https://github.com/GaloisInc/macaw/blob/186c35da7c35e303ea3e06e026cdf8f0b375ecb5/macaw-ppc-symbolic/src/Data/Macaw/PPC/Symbolic/Functions.hs#L235-L246
Dividing by zero in a division-related instruction (in
macaw-x86-symbolic
):https://github.com/GaloisInc/macaw/blob/186c35da7c35e303ea3e06e026cdf8f0b375ecb5/x86_symbolic/src/Data/Macaw/X86/Crucible.hs#L725-L729
A quotient overflowing in a division-related instruction (in
macaw-x86-symbolic
):https://github.com/GaloisInc/macaw/blob/186c35da7c35e303ea3e06e026cdf8f0b375ecb5/x86_symbolic/src/Data/Macaw/X86/Crucible.hs#L767-L769 https://github.com/GaloisInc/macaw/blob/186c35da7c35e303ea3e06e026cdf8f0b375ecb5/x86_symbolic/src/Data/Macaw/X86/Crucible.hs#L807-L809
Unlike the checks in the
crucible-llvm
memory model, however, these assertions are all performed viaGenericSimError
orAssertFailureSimError
. As a result, it is not straightforward to catch these assertions and perform subsequent analysis on them after simulation fails.I propose that
macaw-symbolic
add a constraint similar toHasLLVMAnn
(perhapsHasMacawAnn
, for lack of a better name) and use it to track the What4 annotations of each of the terms that give rise to these assertion failures. That way, one can consult theMap
of badmacaw-symbolic
behaviors afterwards and match the annotations to the corresponding terms. This would require a fair bit of API churn in order to thread the newHasMacawAnn
constraint through to the relevant functions, however.