GaloisInc / macaw

Open source binary analysis tools.
BSD 3-Clause "New" or "Revised" License
208 stars 21 forks source link

`macaw-symbolic`: Inconsistent reporting of memory writes to read-only sections of memory #418

Open RyanGlScott opened 3 months ago

RyanGlScott commented 3 months ago

The default implementation of mkGlobalPointerValidityPred (here) is designed to check if you are attempting a memory write to a read-only section of the global address space, and if so, it will throw an assertion failure. I've discovered that this only works some of the time, however. To see what I mean, let's cook up a small test case that demonstrates writing to a read-only part of memory:

// test.c

int x = 0;

__attribute__((noinline)) int test_and_verify_set_x(void) {
  x = 1;
  return x;
}

void _start(void) {
  test_and_verify_set_x();
}

Writing to x is fine on its own, but I will use objcopy to mark the ELF section that x resides in as read-only (as though x were a constant):

$ ~/Software/powerpc-linux-musl-cross/bin/powerpc-linux-musl-gcc -nostdlib -fno-stack-protector -fcf-protection=none test.c -o test.exe && ~/Software/powerpc-linux-musl-cross/bin/powerpc-linux-musl-objcopy --writable-text --set-section-flags .sbss=readonly test.exe

(I am using a musl-based, PPC32 version of gcc in the example above, but the same principle should apply to any architecture.)

Let's test this out. The easier way I know how to test this is to make the following change to macaw-symbolic:

diff --git a/symbolic/src/Data/Macaw/Symbolic/Testing.hs b/symbolic/src/Data/Macaw/Symbolic/Testing.hs
index 6203e531..85d547aa 100644
--- a/symbolic/src/Data/Macaw/Symbolic/Testing.hs
+++ b/symbolic/src/Data/Macaw/Symbolic/Testing.hs
@@ -514,7 +514,7 @@ simulateFunction binfo bak execFeatures archInfo archVals halloc mmPreset g = do
         let mmConf = (MSM.memModelConfig bak memPtrTbl)
                        { MS.lookupFunctionHandle = lookupFunction archVals binfo
                        , MS.lookupSyscallHandle = lookupSyscall
-                       , MS.mkGlobalPointerValidityAssertion = validityCheck
+                       -- , MS.mkGlobalPointerValidityAssertion = validityCheck
                        }
         pure (initMem, mmConf)
       LazyMemModel -> do
@@ -524,7 +524,7 @@ simulateFunction binfo bak execFeatures archInfo archVals halloc mmPreset g = do
         let mmConf = (MSMLazy.memModelConfig bak memPtrTbl)
                        { MS.lookupFunctionHandle = lookupFunction archVals binfo
                        , MS.lookupSyscallHandle = lookupSyscall
-                       , MS.mkGlobalPointerValidityAssertion = validityCheck
+                       -- , MS.mkGlobalPointerValidityAssertion = validityCheck
                        }
         pure (initMem, mmConf)
   (stackBasePtr, mem1) <- CLM.doMalloc bak CLM.StackAlloc CLM.Mutable "stack_alloc" initMem stackSize CLD.noAlignment

Then compile macaw-{ppc,x86}-symbolic, and then run this test harness program (based on the macaw-{ppc,x86}-symbolic test suites):

```hs -- Bug.hs {-# LANGUAGE DataKinds #-} {-# LANGUAGE FlexibleContexts #-} {-# LANGUAGE GADTs #-} {-# LANGUAGE ImplicitParams #-} {-# LANGUAGE OverloadedStrings #-} {-# LANGUAGE ScopedTypeVariables #-} {-# LANGUAGE TypeApplications #-} {-# LANGUAGE TypeOperators #-} module Main (main) where import Control.Lens ((^.)) import qualified Data.ByteString as BS import qualified Data.ByteString.Char8 as BS8 import qualified Data.ElfEdit as Elf import qualified Data.Foldable as F import qualified Data.Map as Map import Data.Maybe ( mapMaybe ) import qualified Data.Parameterized.Classes as PC import qualified Data.Parameterized.Nonce as PN import Data.Parameterized.Some ( Some(..) ) import Data.Proxy ( Proxy(..) ) import GHC.TypeLits ( KnownNat ) import qualified What4.Config as WC import qualified What4.Expr as WE import qualified What4.Expr.Builder as WEB import qualified What4.FloatMode as WF import qualified What4.Interface as WI import qualified What4.ProblemFeatures as WPF import qualified What4.Solver as WS import qualified Lang.Crucible.Backend as CB import qualified Lang.Crucible.Backend.Online as CBO import qualified Lang.Crucible.Simulator as CS import qualified Lang.Crucible.LLVM.MemModel as LLVM import qualified Dismantle.PPC as DP import qualified SemMC.Architecture.PPC as SAP import qualified Data.Macaw.CFG as MC import qualified Data.Macaw.Discovery as M import qualified Data.Macaw.Memory as MM import qualified Data.Macaw.Memory.ElfLoader.PLTStubs as MMEP import qualified Data.Macaw.Symbolic as MS import qualified Data.Macaw.Symbolic.Testing as MST import qualified Data.Macaw.PPC as MP import Data.Macaw.PPC.Symbolic () import qualified Data.Macaw.X86 as MX import Data.Macaw.X86.Symbolic () main :: IO () main = do -- These are pass/fail in that the assertions in the "pass" set are true (and -- the solver returns Unsat), while the assertions in the "fail" set are false -- (and the solver returns Sat). let passRes = MST.SimulationResult MST.Unsat mkSymExTest passRes MST.LazyMemModel "test.exe" hasTestPrefix :: Some (M.DiscoveryFunInfo arch) -> Maybe (BS8.ByteString, Some (M.DiscoveryFunInfo arch)) hasTestPrefix (Some dfi) = do bsname <- M.discoveredFunSymbol dfi if BS8.pack "test_" `BS8.isPrefixOf` bsname then return (bsname, Some dfi) else Nothing mkSymExTest :: MST.SimulationResult -> MST.MemModelPreset -> FilePath -> IO () mkSymExTest expected mmPreset exePath = do bytes <- BS.readFile exePath case Elf.decodeElfHeaderInfo bytes of Left (_, msg) -> fail ("Error parsing ELF header from file '" ++ show exePath ++ "': " ++ msg) Right (Elf.SomeElf ehi) -> do case (Elf.headerClass (Elf.header ehi), Elf.headerMachine (Elf.header ehi)) of (Elf.ELFCLASS64, Elf.EM_X86_64) -> do binariesInfo <- MST.runDiscovery ehi exePath MST.toAddrSymMap MX.x86_64_linux_info MX.x86_64PLTStubInfo let funInfos = Map.elems (MST.binaryDiscState (MST.mainBinaryInfo binariesInfo) ^. M.funInfo) let testEntryPoints = mapMaybe hasTestPrefix funInfos F.forM_ testEntryPoints $ \(_name, Some dfi) -> do Some (gen :: PN.NonceGenerator IO t) <- PN.newIONonceGenerator sym <- WEB.newExprBuilder WF.FloatRealRepr WE.EmptyExprBuilderState gen CBO.withYicesOnlineBackend sym CBO.NoUnsatFeatures WPF.noFeatures $ \bak -> do -- We are using the z3 backend to discharge proof obligations, so -- we need to add its options to the backend configuration let solver = WS.z3Adapter let backendConf = WI.getConfiguration sym WC.extendConfig (WS.solver_adapter_config_options solver) backendConf execFeatures <- MST.defaultExecFeatures (MST.SomeOnlineBackend bak) archVals <- case MS.archVals (Proxy @MX.X86_64) Nothing of Just archVals -> pure archVals Nothing -> error "mkSymExTest: impossible" let extract = x86ResultExtractor archVals let ?memOpts = LLVM.defaultMemOptions simRes <- MST.simulateAndVerify solver WS.defaultLogData bak execFeatures MX.x86_64_linux_info archVals binariesInfo mmPreset extract dfi if expected == simRes then putStrLn "Test passed" else fail $ "Expected " ++ show expected ++ ", but got " ++ show simRes (Elf.ELFCLASS32, Elf.EM_PPC) -> do binariesInfo <- MST.runDiscovery ehi exePath MST.toAddrSymMap MP.ppc32_linux_info (MMEP.noPLTStubInfo "PPC") let funInfos = Map.elems (MST.binaryDiscState (MST.mainBinaryInfo binariesInfo) ^. M.funInfo) let testEntryPoints = mapMaybe hasTestPrefix funInfos F.forM_ testEntryPoints $ \(_name, Some dfi) -> do Some (gen :: PN.NonceGenerator IO t) <- PN.newIONonceGenerator sym <- WEB.newExprBuilder WF.FloatRealRepr WE.EmptyExprBuilderState gen CBO.withYicesOnlineBackend sym CBO.NoUnsatFeatures WPF.noFeatures $ \bak -> do -- We are using the z3 backend to discharge proof obligations, so -- we need to add its options to the backend configuration let solver = WS.z3Adapter let backendConf = WI.getConfiguration sym WC.extendConfig (WS.solver_adapter_config_options solver) backendConf execFeatures <- MST.defaultExecFeatures (MST.SomeOnlineBackend bak) archVals <- case MS.archVals (Proxy @MP.PPC32) Nothing of Just archVals -> pure archVals Nothing -> error "mkSymExTest: impossible" let extract = ppcResultExtractor archVals let ?memOpts = LLVM.defaultMemOptions simRes <- MST.simulateAndVerify solver WS.defaultLogData bak execFeatures MP.ppc32_linux_info archVals binariesInfo mmPreset extract dfi if expected == simRes then putStrLn "Test passed" else fail $ "Expected " ++ show expected ++ ", but got " ++ show simRes _ -> error "Unsupported architecture" x86ResultExtractor :: (CB.IsSymInterface sym) => MS.ArchVals MX.X86_64 -> MST.ResultExtractor sym MX.X86_64 x86ResultExtractor archVals = MST.ResultExtractor $ \regs _sp _mem k -> do let re = MS.lookupReg archVals regs MX.RAX k PC.knownRepr (CS.regValue re) ppcResultExtractor :: ( arch ~ MP.AnyPPC v , CB.IsSymInterface sym , MP.KnownVariant v , MM.MemWidth (SAP.AddrWidth v) , MC.ArchConstraints arch , MS.ArchInfo arch , KnownNat (SAP.AddrWidth v) ) => MS.ArchVals arch -> MST.ResultExtractor sym arch ppcResultExtractor archVals = MST.ResultExtractor $ \regs _sp _mem k -> do let re = MS.lookupReg archVals regs (MP.PPC_GP (DP.GPR 3)) k PC.knownRepr (CS.regValue re) ```

If you do all that, you should get the following error:

$ ~/Software/powerpc-linux-musl-cross/bin/powerpc-linux-musl-gcc -nostdlib -fno-stack-protector -fcf-protection=none test.c -o test.exe && ~/Software/powerpc-linux-musl-cross/bin/powerpc-linux-musl-objcopy --writable-text --set-section-flags .sbss=readonly test.exe
$ runghc Bug.hs
Bug.hs: Failed to prove goal: test.exe: segment1+0x1e0: error: in test_and_verify_set_x
PointerWrite outside of static memory range (known BlockID 0): let -- test.exe: segment1+0x1d8
    v442 = select cglobalMemoryBytes@1:a 0x1ff6b:[32]
    -- test.exe: segment1+0x1d8
    v447 = bvConcat (select cglobalMemoryBytes@1:a 0x1ff6a:[32]) v442
    -- test.exe: segment1+0x1d8
    v452 = bvConcat (select cglobalMemoryBytes@1:a 0x1ff69:[32]) v447
 in bvConcat (select cglobalMemoryBytes@1:a 0x1ff68:[32]) v452
CallStack (from HasCallStack):
  error, called at src/Data/Macaw/Symbolic/Testing.hs:338:33 in macaw-symbolic-0.0.1-inplace:Data.Macaw.Symbolic.Testing

So far, so good.

The example above used PPC32, however. What happens if we use an x86-64 version of gcc instead?

$ gcc -nostdlib -fno-stack-protector -fcf-protection=none test.c -o test.exe && objcopy --writable-text --set-section-flags .bss=readonly test.exe
$ runghc Bug.hs
Test passed

Huh? How come the x86-64 version passes the test, but the PPC32 version fails? There is a hint in this part of mkGlobalPointerValidityPredCommon:

https://github.com/GaloisInc/macaw/blob/3e83b3eff3067c151f65f44c1e64021e36ad935a/symbolic/src/Data/Macaw/Symbolic/Memory/Common.hs#L168-L177

Note that this will only attempt to emit an outside of static memory range assertion failure if the LLVMPtr representing the global address has a block number of zero. If it has a non-zero block number, then there is no chance of an assertion failure. And sure enough, if we add some additional debug-printing to this function:

diff --git a/symbolic/src/Data/Macaw/Symbolic/Memory/Common.hs b/symbolic/src/Data/Macaw/Symbolic/Memory/Common.hs
index d72ef96d..ac3094d6 100644
--- a/symbolic/src/Data/Macaw/Symbolic/Memory/Common.hs
+++ b/symbolic/src/Data/Macaw/Symbolic/Memory/Common.hs
@@ -167,6 +167,10 @@ mkGlobalPointerValidityPredCommon tbl = \sym puse mcond ptr -> do

   let ptrVal = CS.regValue ptr
   let (ptrBase, ptrOff) = CL.llvmPointerView ptrVal
+  putStrLn $ unlines
+    [ "RGS mkGlobalPointerValidityPredCommon"
+    , show $ CL.ppPtr ptrVal
+    ]
   case WI.asNat ptrBase of
     Just 0 -> do
       p <- mkPred ptrOff

And then if we re-run the x86-64 version of the program, we see:

$ runghc Bug.hs
<snip>

RGS mkGlobalPointerValidityPredCommon
(1, 0x4000:[64])

<snip>

Test passed

In the x86-64 version of the binary, 0x4000 is x's address. Here, we see that the block number is 1, so macaw-symbolic won't check if the offset is out of range.

On the other hand, if we run the PPC32 version of the program, we see:

$ runghc Bug.hs
<snip>

RGS mkGlobalPointerValidityPredCommon
let -- test.exe: segment1+0x1d8
    v442 = select cglobalMemoryBytes@1:a 0x1ff6b:[32]
    -- test.exe: segment1+0x1d8
    v447 = bvConcat (select cglobalMemoryBytes@1:a 0x1ff6a:[32]) v442
    -- test.exe: segment1+0x1d8
    v452 = bvConcat (select cglobalMemoryBytes@1:a 0x1ff69:[32]) v447
 in bvConcat (select cglobalMemoryBytes@1:a 0x1ff68:[32]) v452

<snip>

Bug.hs: Failed to prove goal: test.exe: segment1+0x1e0: error: in test_and_verify_set_x
PointerWrite outside of static memory range (known BlockID 0): let -- test.exe: segment1+0x1d8
    v442 = select cglobalMemoryBytes@1:a 0x1ff6b:[32]
    -- test.exe: segment1+0x1d8
    v447 = bvConcat (select cglobalMemoryBytes@1:a 0x1ff6a:[32]) v442
    -- test.exe: segment1+0x1d8
    v452 = bvConcat (select cglobalMemoryBytes@1:a 0x1ff69:[32]) v447
 in bvConcat (select cglobalMemoryBytes@1:a 0x1ff68:[32]) v452
CallStack (from HasCallStack):
  error, called at src/Data/Macaw/Symbolic/Testing.hs:338:33 in macaw-symbolic-0.0.1-inplace:Data.Macaw.Symbolic.Testing

This time, we see that the block number is 0 (by virtue of the fact that it's not printing a pair).

What is causing this discrepancy? Ultimately, it's because when simulating the x86-64 version of the program, macaw-symbolic attempts to write to an address that was resolved using MacawGlobalPtr. This will translate an absolute address (e.g., 0x4000 with a block number of 0) into an offset from the LLVMPtr that backs global memory (e.g., (1, 0x4000)). When simulating the PPC32 version of the program, however, macaw-symbolic attempts to write to an absolute address (0x2000) that was not resolved using MacawGlobalPtr. (I'm very unclear on why the two architectures differ in this way.)

What should we do about this? We could investigate why the two architectures differ here, although I'm unsure how deep that rabbit hole goes. In principle, mkGlobalPointerValidityPredCommon ought to be able to handle addresses that were resolved using MacawGlobalPtr, so perhaps we should teach it to also make assertions about LLVMPtrs whose block number matches that of the LLVMPtr backing global memory.

RyanGlScott commented 3 months ago

See https://github.com/GaloisInc/macaw/issues/313 for a similar problem affecting other pointer operations (e.g., PtrEq).