Open jwaldmann opened 8 years ago
Changing Data.Set.Set to Data.Sequence.Seq in
class DIMACS t where
...
dimacsClauses :: t -> Seq IntSet
(Ersatz.Problem and Ersatz.Internal.Formula) already helps a bit (reduces 20 s for writing to 10 s)
In satchmo, I write asserts on the fly (to minisat-api), so the complete CNF is not actually in Haskell memory (It is in minisat's memory, of course).
DIMACS file format wants number of variables and clauses in the beginning of the file, and that makes it more difficult.
Writing the CNF still feels slow. I am counting 10 MByte per minute after the head of the formula is already written - so ersatz knows the number of variables and of clauses (so it knows all clauses at this point). Then what is it computing? Raw writing speed should be more like 10 MByte per second?
Here's the profiling data for above example
speed +RTS -P -H -RTS 1000000
total time = 42.78 secs (42782 ticks @ 1000 us, 1 processor)
total alloc = 31,984,151,680 bytes (excludes profiling overheads)
COST CENTRE MODULE SRC %time %alloc ticks bytes
parseSolution5.vars Ersatz.Solver.Minisat src/Ersatz/Solver/Minisat.hs:103:5-42 35.3 40.9 15117 13084496960
bClause.\ Ersatz.Problem src/Ersatz/Problem.hs:297:36-61 7.9 10.1 3395 3224431992
runBit Ersatz.Bit src/Ersatz/Bit.hs:(175,1)-(187,32) 7.0 5.9 2976 1887998112
parseSolution5.vlines Ersatz.Solver.Minisat src/Ersatz/Solver/Minisat.hs:102:5-39 4.4 6.5 1888 2064150816
makeStableName' Ersatz.Internal.StableName src/Ersatz/Internal/StableName.hs:(20,1)-(23,35) 4.2 0.4 1809 127999944
runBit.\ Ersatz.Bit src/Ersatz/Bit.hs:(179,3)-(187,32) 3.7 4.1 1603 1303998696
insert'.go Data.HashMap.Internal Data/HashMap/Internal.hs:(787,5)-(816,76) 3.0 1.5 1270 490066008
formulaXor Ersatz.Internal.Formula src/Ersatz/Internal/Formula.hs:(211,1)-(217,11) 2.9 4.1 1252 1313248704
lookup# Data.HashMap.Internal Data/HashMap/Internal.hs:599:1-82 2.9 0.8 1250 253762960
bClause Ersatz.Problem src/Ersatz/Problem.hs:297:1-91 2.6 0.7 1102 207999808
assert Ersatz.Bit src/Ersatz/Bit.hs:(164,1)-(171,34) 2.3 2.0 972 631999368
parseSolution5 Ersatz.Solver.Minisat src/Ersatz/Solver/Minisat.hs:(100,1)-(103,42) 2.1 3.6 892 1142799600
dimacs.clauses Ersatz.Problem src/Ersatz/Problem.hs:252:5-38 1.7 1.6 746 519999840
<> Ersatz.Internal.Formula src/Ersatz/Internal/Formula.hs:83:3-43 1.7 2.3 746 727992912
clone16 Data.HashMap.Internal Data/HashMap/Internal.hs:(2261,1)-(2262,19) 1.7 3.2 722 1037544304
cryptominisat5Path.\ Ersatz.Solver.Minisat src/Ersatz/Solver/Minisat.hs:(88,47)-(97,31) 1.6 2.3 671 733872304
literally Ersatz.Bit src/Ersatz/Bit.hs:126:3-34 1.3 0.7 544 224000152
copy.\ Data.HashMap.Internal.Array Data/HashMap/Internal/Array.hs:(335,9)-(336,30) 1.2 1.4 530 446444656
lastAtom.\ Ersatz.Problem src/Ersatz/Problem.hs:103:39-68 0.9 1.1 371 351999824
satClauses Ersatz.Problem src/Ersatz/Problem.hs:350:1-55 0.8 1.2 328 367999520
and for double size
speed +RTS -P -H -M8G -RTS 2000000
total time = 88.01 secs (88010 ticks @ 1000 us, 1 processor)
total alloc = 64,605,103,944 bytes (excludes profiling overheads)
COST CENTRE MODULE SRC %time %alloc ticks bytes
parseSolution5.vars Ersatz.Solver.Minisat src/Ersatz/Solver/Minisat.hs:103:5-42 35.8 41.2 31482 26638630288
bClause.\ Ersatz.Problem src/Ersatz/Problem.hs:297:36-61 8.2 10.0 7210 6448901496
runBit Ersatz.Bit src/Ersatz/Bit.hs:(175,1)-(187,32) 6.5 5.8 5743 3775998112
parseSolution5.vlines Ersatz.Solver.Minisat src/Ersatz/Solver/Minisat.hs:102:5-39 4.6 6.6 4079 4275706272
makeStableName' Ersatz.Internal.StableName src/Ersatz/Internal/StableName.hs:(20,1)-(23,35) 4.6 0.4 4024 255999944
insert'.go Data.HashMap.Internal Data/HashMap/Internal.hs:(787,5)-(816,76) 3.4 1.9 3032 1206018912
runBit.\ Ersatz.Bit src/Ersatz/Bit.hs:(179,3)-(187,32) 3.4 4.0 3005 2607998696
lookup# Data.HashMap.Internal Data/HashMap/Internal.hs:599:1-82 3.4 0.9 2974 604208496
formulaXor Ersatz.Internal.Formula src/Ersatz/Internal/Formula.hs:(211,1)-(217,11) 3.0 4.1 2615 2626498704
bClause Ersatz.Problem src/Ersatz/Problem.hs:297:1-91 2.7 0.6 2385 415999808
assert Ersatz.Bit src/Ersatz/Bit.hs:(164,1)-(171,34) 2.1 2.0 1832 1263999368
parseSolution5 Ersatz.Solver.Minisat src/Ersatz/Solver/Minisat.hs:(100,1)-(103,42) 2.1 3.7 1824 2365598960
clone16 Data.HashMap.Internal Data/HashMap/Internal.hs:(2261,1)-(2262,19) 1.9 3.6 1651 2344719976
<> Ersatz.Internal.Formula src/Ersatz/Internal/Formula.hs:83:3-43 1.8 2.3 1554 1455993224
dimacs.clauses Ersatz.Problem src/Ersatz/Problem.hs:252:5-38 1.6 1.6 1365 1039999824
cryptominisat5Path.\ Ersatz.Solver.Minisat src/Ersatz/Solver/Minisat.hs:(88,47)-(97,31) 1.5 2.3 1304 1457643744
literally Ersatz.Bit src/Ersatz/Bit.hs:126:3-34 1.2 0.7 1098 448000152
satClauses Ersatz.Problem src/Ersatz/Problem.hs:350:1-55 0.8 1.1 727 735999640
lastAtom.\ Ersatz.Problem src/Ersatz/Problem.hs:103:39-68 0.8 1.1 717 703999824
The observation still holds: the solver (cryptominisat) takes a few seconds but writing the formula, and parsing the result, takes a minute!!
One aspect is that for this specific test, anyminisat was calling cryptominisat5 and that is inefficient at reading the assignment. I'll make a separate issue, and the fix should be easy - just apply the path that improve minisat's output parsing (attoparsec instead of Read?)
This still does not explain slow CNF writing.
Run the following (it creates and solves a formula that is not hard). Take the total time, and compare to actual solver time.
E.g., argument 100000, on my machine: total 40 seconds.Minisat starts only after 20 s, then takes < 1 s, then another 20 s are used for parsing the result (I'll open a separate issue)