GaloisInc / macaw

Open source binary analysis tools.
BSD 3-Clause "New" or "Revised" License
203 stars 20 forks source link

Add a `MemoryModelContents` option for not initializing writable global data #372

Open langston-barrett opened 7 months ago

langston-barrett commented 7 months ago

The MemoryModelContents datatype determines how writable global data is initialized.

https://github.com/GaloisInc/macaw/blob/7e1694b0ef87e4e431b73d1df14e55d47670e4b1/symbolic/src/Data/Macaw/Symbolic/Memory/Common.hs#L66

The Haddock on this type notes that to support sound verification of individual functions in isolation, such data must be initialized as fully symbolic, i.e., an overapproximation of the possible runtime values. However, none of the available options enable sound bug-finding, which would require that globals are initialized to an underapproximation of the possible runtime values. This is infeasible in practice, as the possible values of the global data upon the entry to the target function are unknowable. The next best thing would be to provide an option to leave such data uninitialized, leading any loads from it to fail, which would prevent unsound results (for bug-finding).

RyanGlScott commented 7 months ago

I can conceptualize how this third option would work in the lazy memory model. The lazy memory model carries around an IntervalMap of which addresses have been initialized, and it would be possible to detect a read from an uninitialized address and throw an error.

It's less obvious to me how to implement this in the eager memory model. The eager memory model backs the global address space with a single SMT array, and reads from an SMT array always succeed. Perhaps we'd need to dedicate an extra bit of space to track whether each address's contents are initialized or not?