GaloisInc / saw-script

The SAW scripting language.
BSD 3-Clause "New" or "Revised" License
437 stars 63 forks source link

`write_aig` consumes very large amounts of memory on some inputs #1826

Open bboston7 opened 1 year ago

bboston7 commented 1 year ago

On a machine with 20GB of memory, write_aig chews through all 20GB in minutes and is killed by the OOM killer on some inputs. The Dilithium Cryptol specs are a reliable way to trigger this behavior. To reproduce, run the following SAWScript against the Dilithium specs:

import "DilithiumRecommended.cry";                                              
write_aig "hash.aig" {{ hash }}; 
RyanGlScott commented 1 year ago

Heap profiling suggests that this is a space leak in aig's and function:

  and g x y =
    do let l = max x y
           r = min x y
       gateMap <- readIORef (andMap g)
       case Bimap.lookupR (l, r) gateMap of
         Nothing ->
           do v <- newVar g
              writeIORef (andMap g) $ Bimap.insert v (l, r) gateMap
              return (varToLit v)
         Just v -> return (varToLit v)

Blasting gateMap with rnf avoids the space leak (at the cost of O(n^2) additional time), so I need to figure out a way to avoid building up the thunks in gateMap.

Another thing that is worth noting is that this code is not as efficient as it could be, as Bimap.insert k1 k2 will first delete any previous k1 or k2 entries in the maps before inserting a new (k1, k2) entry. However, in the place where Bimap.insert v (l, r) is used above, we know that v and (l, r) do not yet exist in the maps, so the deletion operation performs unneeded work. Unfortunately, I'm unclear if the bimap library offers a more efficient version of insert that bypasses the deletion step. (I'm unclear if this relates to the space leak, but it does stand out to me as a possible improvement.)

RyanGlScott commented 1 year ago

The following patch to aig suffices to fix the space leaks that appear in info table profiling reports:

diff --git a/src/Data/AIG/CompactGraph.hs b/src/Data/AIG/CompactGraph.hs
index ecc075e..0daff3c 100644
--- a/src/Data/AIG/CompactGraph.hs
+++ b/src/Data/AIG/CompactGraph.hs
@@ -421,19 +425,16 @@ instance IsAIG CompactLit CompactGraph where
        return (varToLit v)

   and g x y =
-    do let l = max x y
-           r = min x y
-           p = (l, r)
+    do let !l = max x y
+           !r = min x y
+           !p = (l, r)
        gateMap <- readIORef (andMap g)
        case Bimap.lookupR p gateMap of
          Nothing ->
            do v <- newVar g
               writeIORef (andMap g) $ Bimap.insert v p gateMap
-              let !lit = varToLit v
-              return lit
-         Just v ->
-           let !lit = varToLit v in
-           return lit
+              return $! varToLit v
+         Just v -> return $! varToLit v

   inputCount g = length <$> readIORef (inputs g)

Unfortunately, that still isn't enough to stop the program from consuming very large amount of memory. I also tried applying some optimizations to the way we are storing these AIGER files in memory:

```diff diff --git a/src/Data/AIG/CompactGraph.hs b/src/Data/AIG/CompactGraph.hs index 0daff3c..3e8f407 100644 --- a/src/Data/AIG/CompactGraph.hs +++ b/src/Data/AIG/CompactGraph.hs @@ -20,6 +20,9 @@ module Data.AIG.CompactGraph ( CompactGraph , CompactLit , CompactNetwork + , CompactAndGateInputs + , mkCompactAndGateInputs + , getCompactAndGateInputs , compactProxy , newCompactGraph ) where @@ -32,15 +35,15 @@ import qualified Data.Attoparsec.Zepto as Zepto import Data.Bits (shiftL, shiftR, (.&.), (.|.), xor, testBit) import Data.IORef (IORef, newIORef, modifyIORef', readIORef, writeIORef, atomicModifyIORef') import Data.List (elemIndex, intersperse) -import Data.Bimap (Bimap) -import Data.Map (Map) -import qualified Data.Bimap as Bimap +import Data.IntMap.Strict (IntMap) +import Data.Map.Strict (Map) +import qualified Data.IntMap.Strict as IntMap import qualified Data.Map.Strict as Map import qualified Data.ByteString as BS import qualified Data.ByteString.Char8 as BSC8 import qualified Data.ByteString.Builder as BS import qualified Data.ByteString.Unsafe as BSU -import Data.Word (Word8, Word32) +import Data.Word (Word8, Word32, Word64) import System.IO (Handle, withFile, IOMode(..)) import Data.AIG.Interface hiding (xor) @@ -53,6 +56,14 @@ newtype Var = Var Word32 nextVar :: Var -> Var nextVar (Var v) = Var (v + 1) +-- | TODO RGS: Docs +internalVarFromInt :: Int -> Var +internalVarFromInt v = Var (fromIntegral v) + +-- | TODO RGS: Docs +internalVarToInt :: Var -> Int +internalVarToInt (Var v) = fromIntegral v + ------------------------------------------------------------------ -- | A compact "Graph" datastructure similar to the AIGER format. data CompactGraph s = @@ -61,8 +72,11 @@ data CompactGraph s = -- ^ The largest variable ID used so far. , inputs :: IORef [Var] -- ^ Inputs, in reverse order! - , andMap :: IORef (Bimap Var (CompactLit s, CompactLit s)) - -- ^ A map from and gate variables to their input literals. + , andGateVarsToInputsMap :: IORef (IntMap (CompactAndGateInputs s)) + -- ^ TODO RGS: Docs + -- ^ TODO RGS: Note invariants about these two IntMaps + , andGateInputsToVarsMap :: IORef (IntMap Var) + -- ^ TODO RGS: Docs } ------------------------------------------------------------------ @@ -78,6 +92,25 @@ type CompactNetwork s = Network CompactLit CompactGraph compactProxy :: Proxy CompactLit CompactGraph compactProxy = Proxy (\x -> x) +------------------------------------------------------------------ +-- | TODO RGS: Docs +newtype CompactAndGateInputs s = CompactAndGateInputs Word64 + deriving (Eq, Ord, Show) + +-- | TODO RGS: Docs +mkCompactAndGateInputs :: CompactLit s -> CompactLit s -> CompactAndGateInputs s +mkCompactAndGateInputs (CompactLit hi) (CompactLit lo) = + CompactAndGateInputs (fromIntegral hi `shiftL` 32 + fromIntegral lo) + +-- | TODO RGS: Docs +getCompactAndGateInputs :: CompactAndGateInputs s -> (CompactLit s, CompactLit s) +getCompactAndGateInputs (CompactAndGateInputs w) = + (CompactLit (fromIntegral (w `shiftR` 32)), CompactLit (fromIntegral w)) + +-- | TODO RGS: Docs +internalCompactAndGateInputsToInt :: CompactAndGateInputs s -> Int +internalCompactAndGateInputsToInt (CompactAndGateInputs w) = fromIntegral w + ------------------------------------------------------------------ -- Core data structure operations @@ -101,9 +134,10 @@ copySign (CompactLit src) (CompactLit dst) = newCompactGraph :: IO (CompactGraph s) newCompactGraph = - do maxVar <- newIORef (Var 0) - inputs <- newIORef [] - andMap <- newIORef Bimap.empty + do maxVar <- newIORef (Var 0) + inputs <- newIORef [] + andGateVarsToInputsMap <- newIORef IntMap.empty + andGateInputsToVarsMap <- newIORef IntMap.empty return (CompactGraph {..}) newVar :: CompactGraph s -> IO Var @@ -120,12 +154,12 @@ newVar g = -- binary AIGER file format. mkVarMap :: [Var] -> - Bimap Var (CompactLit s, CompactLit s) -> + IntMap (CompactAndGateInputs s) -> (Map Var Var) mkVarMap ins gateMap = Map.fromList (zip varList [Var 0..]) where - varList = [Var 0] ++ ins ++ Bimap.keys gateMap + varList = [Var 0] ++ ins ++ map internalVarFromInt (IntMap.keys gateMap) -- | Adjust a literal according to the given variable mapping. lookupLit :: CompactLit s -> Map Var Var -> Maybe (CompactLit s) @@ -162,15 +196,15 @@ writeHeader :: [Var] -> Int -> [CompactLit s] -> - Bimap Var (CompactLit s, CompactLit s) -> + IntMap (CompactAndGateInputs s) -> IO () -writeHeader h format (Var var) ins latches outs gateMap = +writeHeader h format (Var var) ins latches outs varsToInputsMap = do hPutBBLn h $ bsUnwords [ BS.byteString (modeString format) , BS.word32Dec var , BS.intDec (length ins) , BS.intDec latches , BS.intDec (length outs) - , BS.intDec (Bimap.size gateMap) + , BS.intDec (IntMap.size varsToInputsMap) ] -- | Write AIGER input lines to the given handle. @@ -223,11 +257,12 @@ writeAnds :: Handle -> AIGFileMode -> Map Var Var -> - Bimap Var (CompactLit s, CompactLit s) -> + IntMap (CompactAndGateInputs s) -> IO () writeAnds h format varMap gateMap = - forM_ (Bimap.assocs gateMap) $ \(v, (l, r)) -> - case (varToLit <$> Map.lookup v varMap + forM_ (IntMap.assocs gateMap) $ \(v, inputs) -> + let (l, r) = getCompactAndGateInputs inputs in + case (varToLit <$> Map.lookup (internalVarFromInt v) varMap , lookupLit l varMap , lookupLit r varMap) of (Just vi, Just li, Just ri) -> @@ -349,7 +384,12 @@ getDifference = go 0 0 getDifferences :: Parser (Word32, Word32) getDifferences = (,) <$> getDifference <*> getDifference -getGraph :: Parser (Var, [Var], [CompactLit s], Bimap Var (CompactLit s, CompactLit s)) +getGraph :: Parser ( Var + , [Var] + , [CompactLit s] + , IntMap (CompactAndGateInputs s) + , IntMap Var + ) getGraph = do (maxvar, ninputs, nlatches, nouts, nands) <- getHeader outputs <- replicateM (nlatches + nouts) getOutput @@ -357,15 +397,23 @@ getGraph = let maxInput = fromIntegral ninputs inputs = [Var 1 .. Var maxInput] andVars = take (length diffPairs) [(maxInput + 1) ..] - addDiff v (ld, rd) = (Var v, (CompactLit l, CompactLit r)) + addDiff v (ld, rd) = (Var v, inps) where l = (v `shiftL` 1) - ld r = l - rd + inps = mkCompactAndGateInputs (CompactLit l) (CompactLit r) andAssocs = zipWith addDiff andVars diffPairs return ( Var (fromIntegral maxvar) , reverse inputs , outputs - , Bimap.fromList andAssocs + , IntMap.fromList $ + map (\(v, inps) -> + (internalVarToInt v, inps)) + andAssocs + , IntMap.fromList $ + map (\(v, inps) -> + (internalCompactAndGateInputsToInt inps, v)) + andAssocs ) abstractEval :: @@ -410,10 +458,11 @@ instance IsAIG CompactLit CompactGraph where aigerNetwork _proxy fp = do -- See Note [Parsing the AIGER format] res <- Zepto.parse getGraph <$> BS.readFile fp - (maxv, inps, outs, gates) <- either fail pure res - maxVar <- newIORef maxv - inputs <- newIORef inps - andMap <- newIORef gates + (maxv, inps, outs, varsToInputs, inputsToVars) <- either fail pure res + maxVar <- newIORef maxv + inputs <- newIORef inps + andGateVarsToInputsMap <- newIORef varsToInputs + andGateInputsToVarsMap <- newIORef inputsToVars return (Network CompactGraph {..} outs) trueLit _g = CompactLit 1 @@ -427,12 +476,16 @@ instance IsAIG CompactLit CompactGraph where and g x y = do let !l = max x y !r = min x y - !p = (l, r) - gateMap <- readIORef (andMap g) - case Bimap.lookupR p gateMap of + !inputs = mkCompactAndGateInputs l r + !inputsInt = internalCompactAndGateInputsToInt inputs + inputsToVarsMap <- readIORef (andGateInputsToVarsMap g) + case IntMap.lookup inputsInt inputsToVarsMap of Nothing -> do v <- newVar g - writeIORef (andMap g) $ Bimap.insert v p gateMap + modifyIORef' (andGateVarsToInputsMap g) $ + IntMap.insert (internalVarToInt v) inputs + writeIORef (andGateInputsToVarsMap g) $ + IntMap.insert inputsInt v inputsToVarsMap return $! varToLit v Just v -> return $! varToLit v @@ -447,23 +500,23 @@ instance IsAIG CompactLit CompactGraph where withFile fp WriteMode $ \h -> do var <- readIORef (maxVar g) ins <- reverse <$> readIORef (inputs g) - gateMap <- readIORef (andMap g) - let vm = mkVarMap ins gateMap + varsToInputsMap <- readIORef (andGateVarsToInputsMap g) + let vm = mkVarMap ins varsToInputsMap format = Binary - writeHeader h format var ins latches outs gateMap + writeHeader h format var ins latches outs varsToInputsMap writeInputs h format latches vm ins writeLatches h format latches vm ins outs writeOutputs h latches vm outs - writeAnds h format vm gateMap + writeAnds h format vm varsToInputsMap writeCNF g out fp = withFile fp WriteMode $ \h -> do Var var <- readIORef (maxVar g) ins <- reverse <$> readIORef (inputs g) - gateMap <- readIORef (andMap g) - let vm = mkVarMap ins gateMap + varsToInputsMap <- readIORef (andGateVarsToInputsMap g) + let vm = mkVarMap ins varsToInputsMap nvars = fromIntegral var + 1 - nclauses = (3 * Bimap.size gateMap) + 2 + nclauses = (3 * IntMap.size varsToInputsMap) + 2 litToCNF lit = case Map.lookup (litToVar lit) vm of Just (Var v) -> @@ -476,8 +529,9 @@ instance IsAIG CompactLit CompactGraph where , BS.intDec nvars , BS.intDec nclauses ] - forM_ (Bimap.assocs gateMap) $ \(v, (ll, rl)) -> - do n <- litToCNF (varToLit v) + forM_ (IntMap.assocs varsToInputsMap) $ \(v, inputs) -> + do let (ll, rl) = getCompactAndGateInputs inputs + n <- litToCNF (varToLit (internalVarFromInt v)) li <- litToCNF ll ri <- litToCNF rl -- 3 clauses for each gate @@ -515,9 +569,9 @@ instance IsAIG CompactLit CompactGraph where -- necessary. litView g l = do ins <- reverse <$> readIORef (inputs g) - gateMap <- readIORef (andMap g) + varsToInputsMap <- readIORef (andGateVarsToInputsMap g) let v = litToVar l - case (elemIndex v ins, Bimap.lookup v gateMap, litNegated l) of + case (elemIndex v ins, getCompactAndGateInputs <$> IntMap.lookup (internalVarToInt v) varsToInputsMap, litNegated l) of (Just i, _, False) -> return (Input i) (Just i, _, True) -> return (NotInput i) (_, Just (l1, l2), False) -> return (And l1 l2) ```

But that still does not help things—I still run out of memory before write_aig can finish.

I'm afraid we are running up against a much more difficult-to-solve issue: the Dilithium Cryptol specs are quite large, and encoding them with write_aig would require a very large AIGER file. (It wouldn't surprise me if it would take several gigabytes.) The current approach that the aig library takes is to first construct a data structure representing the graph in memory, and then aig will write it to an AIGER file. The problem is that if the AIGER file is massize, then the in-memory representation of the graph in Haskell will be even more massive—and in this case, it is massive enough to use up all of your RAM.

I think the only reasonable way to make this work would be to use some kind of on-demand streaming API for gradually writing the contents of an AIG to a file using constant memory. I belive this should be quite doable, as the AIGER file format puts every input, output, latch, and AND-gate on its own line, meaning that it ought to be possible to write the file one line at a time without needing to keep the entire graph in memory. And all of the AIG-related operations are exposed via the IsAIG type class, so we could offer an alternative, streaming API by simply defining a separate IsAIG instance.

weaversa commented 1 year ago

@alanminko and I (mostly Alan) came up with a way to represent hierarchical AIGs using the current AIGER format by introducing barrier buffers in a non-complicated way. This would drastically cut down on the size of your AIGs because each function could only need to be represented once. Feel free to reach out.

On Mon, Mar 27, 2023 at 7:33 PM Ryan Scott @.***> wrote:

The following patch to aig suffices to fix the space leaks that appear in info table profiling reports:

diff --git a/src/Data/AIG/CompactGraph.hs b/src/Data/AIG/CompactGraph.hs index ecc075e..0daff3c 100644--- a/src/Data/AIG/CompactGraph.hs+++ b/src/Data/AIG/CompactGraph.hs@@ -421,19 +425,16 @@ instance IsAIG CompactLit CompactGraph where return (varToLit v)

and g x y =- do let l = max x y- r = min x y- p = (l, r)+ do let !l = max x y+ !r = min x y+ !p = (l, r) gateMap <- readIORef (andMap g) case Bimap.lookupR p gateMap of Nothing -> do v <- newVar g writeIORef (andMap g) $ Bimap.insert v p gateMap- let !lit = varToLit v- return lit- Just v ->- let !lit = varToLit v in- return lit+ return $! varToLit v+ Just v -> return $! varToLit v

inputCount g = length <$> readIORef (inputs g)

Unfortunately, that still isn't enough to stop the program from consuming very large amount of memory. I also tried applying some optimizations to the way we are storing these AIGER files in memory:

diff --git a/src/Data/AIG/CompactGraph.hs b/src/Data/AIG/CompactGraph.hs index 0daff3c..3e8f407 100644--- a/src/Data/AIG/CompactGraph.hs+++ b/src/Data/AIG/CompactGraph.hs@@ -20,6 +20,9 @@ module Data.AIG.CompactGraph ( CompactGraph , CompactLit , CompactNetwork+ , CompactAndGateInputs+ , mkCompactAndGateInputs+ , getCompactAndGateInputs , compactProxy , newCompactGraph ) where@@ -32,15 +35,15 @@ import qualified Data.Attoparsec.Zepto as Zepto import Data.Bits (shiftL, shiftR, (.&.), (.|.), xor, testBit) import Data.IORef (IORef, newIORef, modifyIORef', readIORef, writeIORef, atomicModifyIORef') import Data.List (elemIndex, intersperse)-import Data.Bimap (Bimap)-import Data.Map (Map)-import qualified Data.Bimap as Bimap+import Data.IntMap.Strict (IntMap)+import Data.Map.Strict (Map)+import qualified Data.IntMap.Strict as IntMap import qualified Data.Map.Strict as Map import qualified Data.ByteString as BS import qualified Data.ByteString.Char8 as BSC8 import qualified Data.ByteString.Builder as BS import qualified Data.ByteString.Unsafe as BSU-import Data.Word (Word8, Word32)+import Data.Word (Word8, Word32, Word64) import System.IO (Handle, withFile, IOMode(..))

import Data.AIG.Interface hiding (xor)@@ -53,6 +56,14 @@ newtype Var = Var Word32 nextVar :: Var -> Var nextVar (Var v) = Var (v + 1) +-- | TODO RGS: Docs+internalVarFromInt :: Int -> Var+internalVarFromInt v = Var (fromIntegral v)++-- | TODO RGS: Docs+internalVarToInt :: Var -> Int+internalVarToInt (Var v) = fromIntegral v+

-- | A compact "Graph" datastructure similar to the AIGER format. data CompactGraph s =@@ -61,8 +72,11 @@ data CompactGraph s = -- ^ The largest variable ID used so far. , inputs :: IORef [Var] -- ^ Inputs, in reverse order!- , andMap :: IORef (Bimap Var (CompactLit s, CompactLit s))- -- ^ A map from and gate variables to their input literals.+ , andGateVarsToInputsMap :: IORef (IntMap (CompactAndGateInputs s))+ -- ^ TODO RGS: Docs+ -- ^ TODO RGS: Note invariants about these two IntMaps+ , andGateInputsToVarsMap :: IORef (IntMap Var)+ -- ^ TODO RGS: Docs }

------------------------------------------------------------------@@ -78,6 +92,25 @@ type CompactNetwork s = Network CompactLit CompactGraph compactProxy :: Proxy CompactLit CompactGraph compactProxy = Proxy (\x -> x) +------------------------------------------------------------------+-- | TODO RGS: Docs+newtype CompactAndGateInputs s = CompactAndGateInputs Word64+ deriving (Eq, Ord, Show)++-- | TODO RGS: Docs+mkCompactAndGateInputs :: CompactLit s -> CompactLit s -> CompactAndGateInputs s+mkCompactAndGateInputs (CompactLit hi) (CompactLit lo) =+ CompactAndGateInputs (fromIntegral hi shiftL 32 + fromIntegral lo)++-- | TODO RGS: Docs+getCompactAndGateInputs :: CompactAndGateInputs s -> (CompactLit s, CompactLit s)+getCompactAndGateInputs (CompactAndGateInputs w) =+ (CompactLit (fromIntegral (w shiftR 32)), CompactLit (fromIntegral w))++-- | TODO RGS: Docs+internalCompactAndGateInputsToInt :: CompactAndGateInputs s -> Int+internalCompactAndGateInputsToInt (CompactAndGateInputs w) = fromIntegral w+

-- Core data structure operations @@ -101,9 +134,10 @@ copySign (CompactLit src) (CompactLit dst) =

newCompactGraph :: IO (CompactGraph s) newCompactGraph =- do maxVar <- newIORef (Var 0)- inputs <- newIORef []- andMap <- newIORef Bimap.empty+ do maxVar <- newIORef (Var 0)+ inputs <- newIORef []+ andGateVarsToInputsMap <- newIORef IntMap.empty+ andGateInputsToVarsMap <- newIORef IntMap.empty return (CompactGraph {..})

newVar :: CompactGraph s -> IO Var@@ -120,12 +154,12 @@ newVar g = -- binary AIGER file format. mkVarMap :: [Var] ->- Bimap Var (CompactLit s, CompactLit s) ->+ IntMap (CompactAndGateInputs s) -> (Map Var Var) mkVarMap ins gateMap = Map.fromList (zip varList [Var 0..]) where- varList = [Var 0] ++ ins ++ Bimap.keys gateMap+ varList = [Var 0] ++ ins ++ map internalVarFromInt (IntMap.keys gateMap)

-- | Adjust a literal according to the given variable mapping. lookupLit :: CompactLit s -> Map Var Var -> Maybe (CompactLit s)@@ -162,15 +196,15 @@ writeHeader :: [Var] -> Int -> [CompactLit s] ->- Bimap Var (CompactLit s, CompactLit s) ->+ IntMap (CompactAndGateInputs s) -> IO ()-writeHeader h format (Var var) ins latches outs gateMap =+writeHeader h format (Var var) ins latches outs varsToInputsMap = do hPutBBLn h $ bsUnwords [ BS.byteString (modeString format) , BS.word32Dec var , BS.intDec (length ins) , BS.intDec latches , BS.intDec (length outs)- , BS.intDec (Bimap.size gateMap)+ , BS.intDec (IntMap.size varsToInputsMap) ]

-- | Write AIGER input lines to the given handle.@@ -223,11 +257,12 @@ writeAnds :: Handle -> AIGFileMode -> Map Var Var ->- Bimap Var (CompactLit s, CompactLit s) ->+ IntMap (CompactAndGateInputs s) -> IO () writeAnds h format varMap gateMap =- forM (Bimap.assocs gateMap) $ (v, (l, r)) ->- case (varToLit <$> Map.lookup v varMap+ forM (IntMap.assocs gateMap) $ (v, inputs) ->+ let (l, r) = getCompactAndGateInputs inputs in+ case (varToLit <$> Map.lookup (internalVarFromInt v) varMap , lookupLit l varMap , lookupLit r varMap) of (Just vi, Just li, Just ri) ->@@ -349,7 +384,12 @@ getDifference = go 0 0 getDifferences :: Parser (Word32, Word32) getDifferences = (,) <$> getDifference <*> getDifference -getGraph :: Parser (Var, [Var], [CompactLit s], Bimap Var (CompactLit s, CompactLit s))+getGraph :: Parser ( Var+ , [Var]+ , [CompactLit s]+ , IntMap (CompactAndGateInputs s)+ , IntMap Var+ ) getGraph = do (maxvar, ninputs, nlatches, nouts, nands) <- getHeader outputs <- replicateM (nlatches + nouts) getOutput@@ -357,15 +397,23 @@ getGraph = let maxInput = fromIntegral ninputs inputs = [Var 1 .. Var maxInput] andVars = take (length diffPairs) [(maxInput + 1) ..]- addDiff v (ld, rd) = (Var v, (CompactLit l, CompactLit r))+ addDiff v (ld, rd) = (Var v, inps) where l = (v shiftL 1) - ld r = l - rd+ inps = mkCompactAndGateInputs (CompactLit l) (CompactLit r) andAssocs = zipWith addDiff andVars diffPairs return ( Var (fromIntegral maxvar) , reverse inputs , outputs- , Bimap.fromList andAssocs+ , IntMap.fromList $+ map ((v, inps) ->+ (internalVarToInt v, inps))+ andAssocs+ , IntMap.fromList $+ map ((v, inps) ->+ (internalCompactAndGateInputsToInt inps, v))+ andAssocs )

abstractEval ::@@ -410,10 +458,11 @@ instance IsAIG CompactLit CompactGraph where aigerNetwork _proxy fp = do -- See Note [Parsing the AIGER format] res <- Zepto.parse getGraph <$> BS.readFile fp- (maxv, inps, outs, gates) <- either fail pure res- maxVar <- newIORef maxv- inputs <- newIORef inps- andMap <- newIORef gates+ (maxv, inps, outs, varsToInputs, inputsToVars) <- either fail pure res+ maxVar <- newIORef maxv+ inputs <- newIORef inps+ andGateVarsToInputsMap <- newIORef varsToInputs+ andGateInputsToVarsMap <- newIORef inputsToVars return (Network CompactGraph {..} outs)

trueLit _g = CompactLit 1@@ -427,12 +476,16 @@ instance IsAIG CompactLit CompactGraph where and g x y = do let !l = max x y !r = min x y- !p = (l, r)- gateMap <- readIORef (andMap g)- case Bimap.lookupR p gateMap of+ !inputs = mkCompactAndGateInputs l r+ !inputsInt = internalCompactAndGateInputsToInt inputs+ inputsToVarsMap <- readIORef (andGateInputsToVarsMap g)+ case IntMap.lookup inputsInt inputsToVarsMap of Nothing -> do v <- newVar g- writeIORef (andMap g) $ Bimap.insert v p gateMap+ modifyIORef' (andGateVarsToInputsMap g) $+ IntMap.insert (internalVarToInt v) inputs+ writeIORef (andGateInputsToVarsMap g) $+ IntMap.insert inputsInt v inputsToVarsMap return $! varToLit v Just v -> return $! varToLit v @@ -447,23 +500,23 @@ instance IsAIG CompactLit CompactGraph where withFile fp WriteMode $ \h -> do var <- readIORef (maxVar g) ins <- reverse <$> readIORef (inputs g)- gateMap <- readIORef (andMap g)- let vm = mkVarMap ins gateMap+ varsToInputsMap <- readIORef (andGateVarsToInputsMap g)+ let vm = mkVarMap ins varsToInputsMap format = Binary- writeHeader h format var ins latches outs gateMap+ writeHeader h format var ins latches outs varsToInputsMap writeInputs h format latches vm ins writeLatches h format latches vm ins outs writeOutputs h latches vm outs- writeAnds h format vm gateMap+ writeAnds h format vm varsToInputsMap

writeCNF g out fp = withFile fp WriteMode $ \h -> do Var var <- readIORef (maxVar g) ins <- reverse <$> readIORef (inputs g)- gateMap <- readIORef (andMap g)- let vm = mkVarMap ins gateMap+ varsToInputsMap <- readIORef (andGateVarsToInputsMap g)+ let vm = mkVarMap ins varsToInputsMap nvars = fromIntegral var + 1- nclauses = (3 Bimap.size gateMap) + 2+ nclauses = (3 IntMap.size varsToInputsMap) + 2 litToCNF lit = case Map.lookup (litToVar lit) vm of Just (Var v) ->@@ -476,8 +529,9 @@ instance IsAIG CompactLit CompactGraph where , BS.intDec nvars , BS.intDec nclauses ]- forM (Bimap.assocs gateMap) $ (v, (ll, rl)) ->- do n <- litToCNF (varToLit v)+ forM (IntMap.assocs varsToInputsMap) $ (v, inputs) ->+ do let (ll, rl) = getCompactAndGateInputs inputs+ n <- litToCNF (varToLit (internalVarFromInt v)) li <- litToCNF ll ri <- litToCNF rl -- 3 clauses for each gate@@ -515,9 +569,9 @@ instance IsAIG CompactLit CompactGraph where -- necessary. litView g l = do ins <- reverse <$> readIORef (inputs g)- gateMap <- readIORef (andMap g)+ varsToInputsMap <- readIORef (andGateVarsToInputsMap g) let v = litToVar l- case (elemIndex v ins, Bimap.lookup v gateMap, litNegated l) of+ case (elemIndex v ins, getCompactAndGateInputs <$> IntMap.lookup (internalVarToInt v) varsToInputsMap, litNegated l) of (Just i, , False) -> return (Input i) (Just i, , True) -> return (NotInput i) (_, Just (l1, l2), False) -> return (And l1 l2)

But that still does not help things—I still run out of memory before write_aig can finish.

I'm afraid we are running up against a much more difficult-to-solve issue: the Dilithium Cryptol specs are quite large, and encoding them with write_aig would require a very large AIGER file. (It wouldn't surprise me if it would take several gigabytes.) The current approach that the aig library takes is to first construct a data structure representing the graph in memory, and then aig will write it to an AIGER file. The problem is that if the AIGER file is massize, then the in-memory representation of the graph in Haskell will be even more massive—and in this case, it is massive enough to use up all of your RAM.

I think the only reasonable way to make this work would be to use some kind of on-demand streaming API for gradually writing the contents of an AIG to a file using constant memory. I belive this should be quite doable, as the AIGER file format puts every input, output, latch, and AND-gate on its own line, meaning that it ought to be possible to write the file one line at a time without needing to keep the entire graph in memory. And all of the AIG-related operations are exposed via the IsAIG https://github.com/GaloisInc/aig/blob/31da58c944358bdd7267ed95d1ea85ee375ff491/src/Data/AIG/Interface.hs#L117 type class, so we could offer an alternative, streaming API by simply defining a separate IsAIG instance.

— Reply to this email directly, view it on GitHub https://github.com/GaloisInc/saw-script/issues/1826#issuecomment-1485997171, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABYQSOXN6U3NN223ZEG3VCDW6IPUXANCNFSM6AAAAAAVLJX4TU . You are receiving this because you are subscribed to this thread.Message ID: @.***>

alanminko commented 1 year ago

Indeed, there is a way to write a hierarchical design into a single AIGER file in such a way that all the hierarchy information is preserved, and at the same time there is no logic duplication, which is inevitable when bit-blasting a design with multiple instances of a module.

However, this new representation format has not been standardized or documented.  There are a few pieces of code, which I wrote in summer 2022 to experiment with the ideas, but the code is not stand-alone or fully functional (it should at least allow for accessing the hierarchy and collapsing some of it or all of it on demand).

Please let me know if having this "hierarchical AIG representation" would be helpful, and I will try to come up with a usable code in a few weeks.

It would also be helpful to know typical constraints:

On 3/28/2023 8:19 AM, weaversa wrote:

@alanminko and I (mostly Alan) came up with a way to represent hierarchical AIGs using the current AIGER format by introducing barrier buffers in a non-complicated way. This would drastically cut down on the size of your AIGs because each function could only need to be represented once. Feel free to reach out.

On Mon, Mar 27, 2023 at 7:33 PM Ryan Scott @.***> wrote:

The following patch to aig suffices to fix the space leaks that appear in info table profiling reports:

diff --git a/src/Data/AIG/CompactGraph.hs b/src/Data/AIG/CompactGraph.hs index ecc075e..0daff3c 100644--- a/src/Data/AIG/CompactGraph.hs+++ b/src/Data/AIG/CompactGraph.hs@@ -421,19 +425,16 @@ instance IsAIG CompactLit CompactGraph where return (varToLit v)

and g x y =- do let l = max x y- r = min x y- p = (l, r)+ do let !l = max x y+ !r = min x y+ !p = (l, r) gateMap <- readIORef (andMap g) case Bimap.lookupR p gateMap of Nothing -> do v <- newVar g writeIORef (andMap g) $ Bimap.insert v p gateMap- let !lit = varToLit v- return lit- Just v ->- let !lit = varToLit v in- return lit+ return $! varToLit v+ Just v -> return $! varToLit v

inputCount g = length <$> readIORef (inputs g)

Unfortunately, that still isn't enough to stop the program from consuming very large amount of memory. I also tried applying some optimizations to the way we are storing these AIGER files in memory:

diff --git a/src/Data/AIG/CompactGraph.hs b/src/Data/AIG/CompactGraph.hs index 0daff3c..3e8f407 100644--- a/src/Data/AIG/CompactGraph.hs+++ b/src/Data/AIG/CompactGraph.hs@@ -20,6 +20,9 @@ module Data.AIG.CompactGraph ( CompactGraph , CompactLit , CompactNetwork+ , CompactAndGateInputs+ , mkCompactAndGateInputs+ , getCompactAndGateInputs , compactProxy , newCompactGraph ) where@@ -32,15 +35,15 @@ import qualified Data.Attoparsec.Zepto as Zepto import Data.Bits (shiftL, shiftR, (.&.), (.|.), xor, testBit) import Data.IORef (IORef, newIORef, modifyIORef', readIORef, writeIORef, atomicModifyIORef') import Data.List (elemIndex, intersperse)-import Data.Bimap (Bimap)-import Data.Map (Map)-import qualified Data.Bimap as Bimap+import Data.IntMap.Strict (IntMap)+import Data.Map.Strict (Map)+import qualified Data.IntMap.Strict as IntMap import qualified Data.Map.Strict as Map import qualified Data.ByteString as BS import qualified Data.ByteString.Char8 as BSC8 import qualified Data.ByteString.Builder as BS import qualified Data.ByteString.Unsafe as BSU-import Data.Word (Word8, Word32)+import Data.Word (Word8, Word32, Word64) import System.IO (Handle, withFile, IOMode(..))

import Data.AIG.Interface hiding (xor)@@ -53,6 +56,14 @@ newtype Var = Var Word32 nextVar :: Var -> Var nextVar (Var v) = Var (v + 1) +-- | TODO RGS: Docs+internalVarFromInt :: Int -> Var+internalVarFromInt v = Var (fromIntegral v)++-- | TODO RGS: Docs+internalVarToInt :: Var -> Int+internalVarToInt (Var v) = fromIntegral v+

-- | A compact "Graph" datastructure similar to the AIGER format. data CompactGraph s =@@ -61,8 +72,11 @@ data CompactGraph s = -- ^ The largest variable ID used so far. , inputs :: IORef [Var] -- ^ Inputs, in reverse order!- , andMap :: IORef (Bimap Var (CompactLit s, CompactLit s))- -- ^ A map from and gate variables to their input literals.+ , andGateVarsToInputsMap :: IORef (IntMap (CompactAndGateInputs s))+ -- ^ TODO RGS: Docs+ -- ^ TODO RGS: Note invariants about these two IntMaps+ , andGateInputsToVarsMap :: IORef (IntMap Var)+ -- ^ TODO RGS: Docs }

------------------------------------------------------------------@@ -78,6 +92,25 @@ type CompactNetwork s = Network CompactLit CompactGraph compactProxy :: Proxy CompactLit CompactGraph compactProxy = Proxy (\x -> x)

+------------------------------------------------------------------+-- | TODO RGS: Docs+newtype CompactAndGateInputs s = CompactAndGateInputs Word64+ deriving (Eq, Ord, Show)++-- | TODO RGS: Docs+mkCompactAndGateInputs :: CompactLit s -> CompactLit s -> CompactAndGateInputs s+mkCompactAndGateInputs (CompactLit hi) (CompactLit lo) =+ CompactAndGateInputs (fromIntegral hi shiftL 32 + fromIntegral lo)++-- | TODO RGS: Docs+getCompactAndGateInputs :: CompactAndGateInputs s -> (CompactLit s, CompactLit s)+getCompactAndGateInputs (CompactAndGateInputs w) =+ (CompactLit (fromIntegral (w shiftR 32)), CompactLit (fromIntegral w))++-- | TODO RGS: Docs+internalCompactAndGateInputsToInt :: CompactAndGateInputs s -> Int+internalCompactAndGateInputsToInt (CompactAndGateInputs w) = fromIntegral w+

-- Core data structure operations @@ -101,9 +134,10 @@ copySign (CompactLit src) (CompactLit dst) =

newCompactGraph :: IO (CompactGraph s) newCompactGraph =- do maxVar <- newIORef (Var 0)- inputs <- newIORef []- andMap <- newIORef Bimap.empty+ do maxVar <- newIORef (Var 0)+ inputs <- newIORef []+ andGateVarsToInputsMap <- newIORef IntMap.empty+ andGateInputsToVarsMap <- newIORef IntMap.empty return (CompactGraph {..})

newVar :: CompactGraph s -> IO Var@@ -120,12 +154,12 @@ newVar g = -- binary AIGER file format. mkVarMap :: [Var] ->- Bimap Var (CompactLit s, CompactLit s) ->+ IntMap (CompactAndGateInputs s) -> (Map Var Var) mkVarMap ins gateMap = Map.fromList (zip varList [Var 0..]) where- varList = [Var 0] ++ ins ++ Bimap.keys gateMap+ varList = [Var 0] ++ ins ++ map internalVarFromInt (IntMap.keys gateMap)

-- | Adjust a literal according to the given variable mapping. lookupLit :: CompactLit s -> Map Var Var -> Maybe (CompactLit s)@@ -162,15 +196,15 @@ writeHeader :: [Var] -> Int -> [CompactLit s] ->- Bimap Var (CompactLit s, CompactLit s) ->+ IntMap (CompactAndGateInputs s) -> IO ()-writeHeader h format (Var var) ins latches outs gateMap =+writeHeader h format (Var var) ins latches outs varsToInputsMap = do hPutBBLn h $ bsUnwords [ BS.byteString (modeString format) , BS.word32Dec var , BS.intDec (length ins) , BS.intDec latches , BS.intDec (length outs)- , BS.intDec (Bimap.size gateMap)+ , BS.intDec (IntMap.size varsToInputsMap) ]

-- | Write AIGER input lines to the given handle.@@ -223,11 +257,12 @@ writeAnds :: Handle -> AIGFileMode -> Map Var Var ->- Bimap Var (CompactLit s, CompactLit s) ->+ IntMap (CompactAndGateInputs s) -> IO () writeAnds h format varMap gateMap =- forM (Bimap.assocs gateMap) $ (v, (l, r)) ->- case (varToLit <$> Map.lookup v varMap+ forM (IntMap.assocs gateMap) $ (v, inputs) ->+ let (l, r) = getCompactAndGateInputs inputs in+ case (varToLit <$> Map.lookup (internalVarFromInt v) varMap , lookupLit l varMap , lookupLit r varMap) of (Just vi, Just li, Just ri) ->@@ -349,7 +384,12 @@ getDifference = go 0 0 getDifferences :: Parser (Word32, Word32) getDifferences = (,) <$> getDifference <*> getDifference -getGraph :: Parser (Var, [Var], [CompactLit s], Bimap Var (CompactLit s, CompactLit s))+getGraph :: Parser ( Var+ , [Var]+ , [CompactLit s]+ , IntMap (CompactAndGateInputs s)+ , IntMap Var+ ) getGraph = do (maxvar, ninputs, nlatches, nouts, nands) <- getHeader outputs <- replicateM (nlatches + nouts) getOutput@@ -357,15 +397,23 @@ getGraph = let maxInput = fromIntegral ninputs inputs = [Var 1 .. Var maxInput] andVars = take (length diffPairs) [(maxInput + 1) ..]- addDiff v (ld, rd) = (Var v, (CompactLit l, CompactLit r))+ addDiff v (ld, rd) = (Var v, inps) where l = (v shiftL 1) - ld r = l - rd+ inps = mkCompactAndGateInputs (CompactLit l) (CompactLit r) andAssocs = zipWith addDiff andVars diffPairs return ( Var (fromIntegral maxvar) , reverse inputs , outputs- , Bimap.fromList andAssocs+ , IntMap.fromList $+ map ((v, inps) ->+ (internalVarToInt v, inps))+ andAssocs+ , IntMap.fromList $+ map ((v, inps) ->+ (internalCompactAndGateInputsToInt inps, v))+ andAssocs )

abstractEval ::@@ -410,10 +458,11 @@ instance IsAIG CompactLit CompactGraph where aigerNetwork _proxy fp = do -- See Note [Parsing the AIGER format] res <- Zepto.parse getGraph <$> BS.readFile fp- (maxv, inps, outs, gates) <- either fail pure res- maxVar <- newIORef maxv- inputs <- newIORef inps- andMap <- newIORef gates+ (maxv, inps, outs, varsToInputs, inputsToVars) <- either fail pure res+ maxVar <- newIORef maxv+ inputs <- newIORef inps+ andGateVarsToInputsMap <- newIORef varsToInputs+ andGateInputsToVarsMap <- newIORef inputsToVars return (Network CompactGraph {..} outs)

trueLit _g = CompactLit 1@@ -427,12 +476,16 @@ instance IsAIG CompactLit CompactGraph where and g x y = do let !l = max x y !r = min x y- !p = (l, r)- gateMap <- readIORef (andMap g)- case Bimap.lookupR p gateMap of+ !inputs = mkCompactAndGateInputs l r+ !inputsInt = internalCompactAndGateInputsToInt inputs+ inputsToVarsMap <- readIORef (andGateInputsToVarsMap g)+ case IntMap.lookup inputsInt inputsToVarsMap of Nothing -> do v <- newVar g- writeIORef (andMap g) $ Bimap.insert v p gateMap+ modifyIORef' (andGateVarsToInputsMap g) $+ IntMap.insert (internalVarToInt v) inputs+ writeIORef (andGateInputsToVarsMap g) $+ IntMap.insert inputsInt v inputsToVarsMap return $! varToLit v Just v -> return $! varToLit v @@ -447,23 +500,23 @@ instance IsAIG CompactLit CompactGraph where withFile fp WriteMode $ \h -> do var <- readIORef (maxVar g) ins <- reverse <$> readIORef (inputs g)- gateMap <- readIORef (andMap g)- let vm = mkVarMap ins gateMap+ varsToInputsMap <- readIORef (andGateVarsToInputsMap g)+ let vm = mkVarMap ins varsToInputsMap format = Binary- writeHeader h format var ins latches outs gateMap+ writeHeader h format var ins latches outs varsToInputsMap writeInputs h format latches vm ins writeLatches h format latches vm ins outs writeOutputs h latches vm outs- writeAnds h format vm gateMap+ writeAnds h format vm varsToInputsMap

writeCNF g out fp = withFile fp WriteMode $ \h -> do Var var <- readIORef (maxVar g) ins <- reverse <$> readIORef (inputs g)- gateMap <- readIORef (andMap g)- let vm = mkVarMap ins gateMap+ varsToInputsMap <- readIORef (andGateVarsToInputsMap g)+ let vm = mkVarMap ins varsToInputsMap nvars = fromIntegral var + 1- nclauses = (3 Bimap.size gateMap) + 2+ nclauses = (3 IntMap.size varsToInputsMap) + 2 litToCNF lit = case Map.lookup (litToVar lit) vm of Just (Var v) ->@@ -476,8 +529,9 @@ instance IsAIG CompactLit CompactGraph where , BS.intDec nvars , BS.intDec nclauses ]- forM (Bimap.assocs gateMap) $ (v, (ll, rl)) ->- do n <- litToCNF (varToLit v)+ forM (IntMap.assocs varsToInputsMap) $ (v, inputs) ->+ do let (ll, rl) = getCompactAndGateInputs inputs+ n <- litToCNF (varToLit (internalVarFromInt v)) li <- litToCNF ll ri <- litToCNF rl -- 3 clauses for each gate@@ -515,9 +569,9 @@ instance IsAIG CompactLit CompactGraph where -- necessary. litView g l = do ins <- reverse <$> readIORef (inputs g)- gateMap <- readIORef (andMap g)+ varsToInputsMap <- readIORef (andGateVarsToInputsMap g) let v = litToVar l- case (elemIndex v ins, Bimap.lookup v gateMap, litNegated l) of+ case (elemIndex v ins, getCompactAndGateInputs <$> IntMap.lookup (internalVarToInt v) varsToInputsMap, litNegated l) of (Just i, , False) -> return (Input i) (Just i, , True) -> return (NotInput i) (_, Just (l1, l2), False) -> return (And l1 l2)

But that still does not help things—I still run out of memory before write_aig can finish.

I'm afraid we are running up against a much more difficult-to-solve issue: the Dilithium Cryptol specs are quite large, and encoding them with write_aig would require a very large AIGER file. (It wouldn't surprise me if it would take several gigabytes.) The current approach that the aig library takes is to first construct a data structure representing the graph in memory, and then aig will write it to an AIGER file. The problem is that if the AIGER file is massize, then the in-memory representation of the graph in Haskell will be even more massive—and in this case, it is massive enough to use up all of your RAM.

I think the only reasonable way to make this work would be to use some kind of on-demand streaming API for gradually writing the contents of an AIG to a file using constant memory. I belive this should be quite doable, as the AIGER file format puts every input, output, latch, and AND-gate on its own line, meaning that it ought to be possible to write the file one line at a time without needing to keep the entire graph in memory. And all of the AIG-related operations are exposed via the IsAIG

https://github.com/GaloisInc/aig/blob/31da58c944358bdd7267ed95d1ea85ee375ff491/src/Data/AIG/Interface.hs#L117 type class, so we could offer an alternative, streaming API by simply defining a separate IsAIG instance.

— Reply to this email directly, view it on GitHub

https://github.com/GaloisInc/saw-script/issues/1826#issuecomment-1485997171, or unsubscribe

https://github.com/notifications/unsubscribe-auth/ABYQSOXN6U3NN223ZEG3VCDW6IPUXANCNFSM6AAAAAAVLJX4TU . You are receiving this because you are subscribed to this thread.Message ID: @.***>

— Reply to this email directly, view it on GitHub https://github.com/GaloisInc/saw-script/issues/1826#issuecomment-1486072909, or unsubscribe https://github.com/notifications/unsubscribe-auth/AI4DBXWZBAAPIDIXIPIJMDTW6I4DFANCNFSM6AAAAAAVLJX4TU. You are receiving this because you were mentioned.Message ID: @.***>

RyanGlScott commented 1 year ago

Thanks for pointing this out to me! If there is a more compact way to build the AIGER files that come out of SAW, I'm definitely interested. I'm unclear if this is sufficient to produce the Dilithium example above in a reasonable amount of time, but it seems worth exploring if nothing else.

I don't have precise answers to your questions, but I'm fairly certain that this Dilithium example will be mostly combinational, as SAW's write_aig command doesn't produce any latches.