LeventErkok / sbv

SMT Based Verification in Haskell. Express properties about Haskell programs and automatically prove them using SMT solvers.
https://github.com/LeventErkok/sbv
Other
243 stars 34 forks source link

Inputs refactored to IntMap #566

Closed doyougnu closed 3 years ago

doyougnu commented 4 years ago

Hi Levent,

This PR is to fix #562. In addition to that it does the following:

Added dependencies

Changes to cabal file

Reduced laziness

INLINE glue code

Test pass rate

After merging with master I get failures from fpRem so it's possible I've reintroduced a bug you fixed in commit aa25b5527687e13d5f3e0123c92d099602873f09. I did make sure commit aa25b5527687e13d5f3e0123c92d099602873f09 was applied so I am not sure what is happening here. Please take a look. Here is my latest run:

Running 1 test suites...
Test suite SBVTest: RUNNING...
SBVTest: Test platform: TestEnvCI CILinux
Remote
  SBVHeavyTests
    Basics.ArithSolver
      genFloats.arithmetic-binary_fpRem.0.5_-0.68302244:                                                                           FAIL
        SBVTestSuite/Utils/SBVTestFramework.hs:100:
        assertion-failure
      ...
18 out of 375853 tests failed (815.39s)
Test suite SBVTest: FAIL
...

Benchmarks

I wasn't expecting to observe any changes in the benchmarks but was pleasantly surprised:

default(0) is the benchmark run on master default(1) is this PR

The rightmost column gives the speedup statistic (time of master - pr-branch, ordered from faster to slower). So from these benchmarks we saw ~13% speedup for certain work loads and ~9% slowdown for some benchmarks. You'll notice that one benchmark Puzzles//Birthday became extremely worse (~1000% worse!, also this is a recurring problem with this benchmark, see #545). I am not sure why that is yet and I have to investigate the code path this benchmark takes. I think there are two likely possibilities: either this benchmark relied on some laziness that I accidentally removed, or the benchmark blew past the timeout limit and the benchmark itself is spurious.

Benchmark                                                  default(0)(μs) default(1) - default(0)(%)
---------------------------------------------------------- -------------- --------------------------
Miscellaneous//Idempotency.Cap                                   10358.04                     -13.23
Miscellaneous//RelativeComplements.UnitRight                     10705.77                     -12.93
Miscellaneous//Complement.Complement                             10380.16                     -12.90
Miscellaneous//InclusionIsPO                                     10258.45                     -12.66
Miscellaneous//Absorption.Cup                                    10927.50                     -12.50
Miscellaneous//Four                                              10998.05                     -12.36
Miscellaneous//RelativeComplements.UnitLeft                      10822.83                     -12.27
Miscellaneous//RelativeComplements.Identity                      10323.72                     -12.18
Miscellaneous//Domination.Cap                                    10880.07                     -12.15
Miscellaneous//RelativeComplements.ComplementIdentity            10824.76                     -12.02
Miscellaneous//DeMorgans.Cup                                     11006.95                     -11.98
Miscellaneous//JoinMeet.2                                        11071.08                     -11.94
Miscellaneous//SubsetChar.Implication                            10496.49                     -11.86
Miscellaneous//Intersection.Difference                           10334.11                     -11.84
Miscellaneous//SubsetEquality                                    11229.95                     -11.81
Miscellaneous//RelativeComplements.ComplementUnion               10985.01                     -11.79
Miscellaneous//JoinMeet.1                                        10836.59                     -11.68
Miscellaneous//Complement.Unique                                 11390.36                     -11.65
Miscellaneous//SubsetChar.Intersection                           11094.20                     -11.65
Miscellaneous//DistributionSubset.Intersection                   11642.91                     -11.63
Miscellaneous//SubsetChar.Union                                  11125.43                     -11.62
Miscellaneous//RelativeComplements.UnionUnion                    11061.71                     -11.57
Miscellaneous//Absorption.Cap                                    10885.76                     -11.55
Miscellaneous//RelativeComplements.InterInters.2                 10990.23                     -11.54
Miscellaneous//JoinMeet.3                                        10245.75                     -11.46
Miscellaneous//JoinMeet.4                                        10296.93                     -11.44
ProofTools//Strengthen.ex5                                       32002.86                     -11.13
Miscellaneous//RelativeComplements.CompFull                      10765.35                     -11.03
Uninterpreted//shannon2                                           3619.35                     -10.97
Miscellaneous//Identity.Union                                    10820.58                     -10.96
Miscellaneous//JoinMeet.5                                        11073.87                     -10.89
Miscellaneous//Idempotency.Cup                                   10204.51                     -10.86
Miscellaneous//SubsetEquality.Transitivity                       11030.44                     -10.80
Miscellaneous//MinE                                              12034.72                     -10.70
Miscellaneous//RelativeComplements.CompComp                      10786.45                     -10.68
Miscellaneous//DeMorgans.Cap                                     10895.53                     -10.66
Uninterpreted//shannon                                            3634.46                     -10.58
Miscellaneous/Birthday                                           12322.96                     -10.58
Miscellaneous//Domination.Cup                                    10796.59                     -10.54
Miscellaneous//Complement.Union                                  10384.86                     -10.43
Miscellaneous//Identity.Intersection                             10795.38                     -10.35
Uninterpreted//SFunArray                                          3576.98                     -10.35
Miscellaneous//DistributionSubset.Union                          11682.52                     -10.25
Miscellaneous//Complement.Intersection                           10396.03                     -10.21
Lists//GenFibs                                                 2239092.47                      -9.89
Miscellaneous//MaxE                                              12474.66                      -9.88
Miscellaneous//SubsetChar.Complement                             10971.15                      -9.84
Miscellaneous//RelativeComplements.Intersection                  10995.89                      -9.82
ProofTools//Strengthen.ex6                                       52964.06                      -9.75
Miscellaneous//Complement.Empty                                   2438.01                      -9.66
ProofTools/Sum.Correctness                                       37365.96                      -9.52
Optimizations/LinearOpt.problem                                   8481.95                      -9.48
Uninterpreted//noWiggle                                           3278.74                      -9.46
Miscellaneous//Commutivity.Union                                 10862.37                      -9.40
Uninterpreted//thmGood                                            3196.10                      -9.31
Miscellaneous//RelativeComplements.Union                         10990.57                      -9.22
Lists//CheckMutex.1                                              11988.16                      -9.16
Optimizations/Production.production                               8468.82                      -9.09
Miscellaneous//SoftConstrain                                     14006.37                      -8.97
Lists//NotFair.1                                                 13468.33                      -8.50
Optimizations//Enumerate.WeekendJustOver                         10396.35                      -8.35
Miscellaneous//Elts                                              11045.41                      -8.32
Uninterpreted//SArray                                             3673.26                      -8.30
Miscellaneous//Distributivity.Intersection                       11044.84                      -8.19
Miscellaneous//Associativity.Intersection                        10942.58                      -8.13
Uninterpreted//existsOk                                           2989.85                      -8.00
Miscellaneous//test.1                                             3559.43                      -8.00
Miscellaneous//Complement.Full                                    2413.13                      -7.86
Uninterpreted//Correctness                                        3899.46                      -7.83
Optimizations//Enumerate.firstWeekend                            10499.55                      -7.65
Miscellaneous/Problem                                            11569.21                      -7.57
Uninterpreted//test                                              12373.18                      -7.53
Uninterpreted//univOk                                             2968.56                      -7.46
ProofTools//BMC.ex1                                               8863.75                      -7.41
ProofTools/Fibonacci.Correctness                                 38467.04                      -7.39
ProofTools//Strengthen.ex3                                       21230.06                      -7.36
ProofTools//Strengthen.ex2                                       52756.38                      -7.22
Miscellaneous//RelativeComplements.UnionInters                   11058.96                      -7.16
Miscellaneous//RelativeComplements.InterInters.1                 11026.80                      -7.09
ProofTools//Strengthen.ex1                                       21305.53                      -6.77
Miscellaneous//Commutivity.Intersection                          10769.32                      -6.58
Optimizations//ExtField.problem                                   8330.09                      -6.40
ProofTools//Strengthen.ex4                                       31724.44                      -6.16
Miscellaneous//test.2                                             3381.32                      -6.13
Miscellaneous//Tuple                                             78930.39                      -6.07
Optimizations//Enumerate.AlmostWeekend                           10347.32                      -6.04
Miscellaneous//Distributivity.Union                              11096.11                      -5.85
Uninterpreted//r2                                                10843.01                      -5.81
Uninterpreted//genLs                                             12598.00                      -5.42
Uninterpreted//t1                                                10799.21                      -5.34
Uninterpreted//synthMul22                                        14252.20                      -5.15
Miscellaneous/genVals                                           128256.02                      -4.88
Lists//NotFair.3                                                 22950.68                      -4.55
Lists//CheckMutex.3                                              25985.63                      -4.53
Lists//CheckMutex.5                                             127573.26                      -4.19
Lists//NotFair.5                                                 87294.72                      -4.09
Miscellaneous//Associativity.Union                               11004.22                      -3.97
ProofTools//BMC.ex2                                              20642.33                      -2.56
Optimizations/VM.allocate                                        10131.90                      -1.74
Puzzles//DogCatMouse                                             11672.01                      -1.61
WeakestPreconditions//Correctness.Sum.badMeasure2                14330.86                      +0.36
WeakestPreconditions//ImperativeGCD                                100.83                      +0.75
BitPrecise//Fast-max                                              5528.25                      +1.06
WeakestPreconditions//Correctness.Basics y+1                      9914.30                      +1.24
Puzzles//Garden                                                 120131.89                      +1.31
WeakestPreconditions//Correctness.Basics skip-assign             10118.31                      +1.42
WeakestPreconditions//Correctness.Sum.alwaysFalseInvariant        9995.09                      +1.46
WeakestPreconditions//Correctness.Basics y is even               11040.25                      +1.70
Puzzles//LadyAndTigers                                           11238.04                      +2.45
BitPrecise//MultMask                                             64087.51                      +2.50
Puzzles//NQueens.NQueens 2                                        3313.96                      +2.58
BitPrecise//Fast-min                                              5520.14                      +2.69
Puzzles//NQueens.NQueens 3                                        4218.67                      +2.90
WeakestPreconditions//ImperativeFib                                 84.33                      +2.95
WeakestPreconditions//Correctness.Sum.alwaysTrueInvariant        10955.82                      +3.01
WeakestPreconditions//Correctness.Basics skip                     9426.89                      +3.27
WeakestPreconditions//Correctness.Sum.badMeasure1                14545.30                      +3.28
WeakestPreconditions//Correctness.Fib                            13734.91                      +3.29
WeakestPreconditions//Correctness.Basics x>0                      9941.50                      +3.29
Puzzles//NQueens.NQueens 1                                        2842.40                      +3.30
WeakestPreconditions//Correctness.GCD                            17038.91                      +3.31
WeakestPreconditions/Correctness.Length                          11721.48                      +3.41
WeakestPreconditions/Correctness.IntDiv                          14344.29                      +3.43
Puzzles//sudoku 6                                                70639.11                      +3.43
Puzzles//Coins                                                  121469.08                      +3.45
WeakestPreconditions/Correctness.IntSqrt                         14786.38                      +3.55
WeakestPreconditions/Correctness.Append                          14049.37                      +3.57
WeakestPreconditions//Correctness.Basics x>-5                     9699.20                      +3.85
Puzzles//sudoku 4                                                58118.10                      +3.93
WeakestPreconditions//Correctness.Sum.correctInvariant           13747.43                      +4.06
Puzzles//sudoku 0                                                71647.84                      +4.39
Puzzles//NQueens.NQueens 4                                        8189.91                      +4.42
WeakestPreconditions//Correctness.Basics y > x                    9519.96                      +4.62
WeakestPreconditions//Correctness.Sum.loopInvariant              14174.38                      +4.67
Puzzles//Puzzles.SendMoreMoney                                   21354.82                      +5.16
Puzzles//MagicSquare.magic 2                                      3469.13                      +5.32
Puzzles//sudoku 2                                                44259.43                      +5.41
Puzzles//sudoku 5                                                43686.05                      +5.88
Puzzles//MagicSquare.magic 3                                     31052.71                      +5.89
Puzzles//NQueens.NQueens 8                                      180954.11                      +6.03
Puzzles//sudoku 3                                                44659.93                      +6.12
Puzzles//sudoku 1                                                31617.65                      +7.12
Puzzles//NQueens.NQueens 6                                       18046.71                      +7.86
Puzzles//NQueens.NQueens 5                                       15588.87                      +8.22
Puzzles//NQueens.NQueens 7                                       59928.98                      +9.21
Puzzles//Birthday                                                12755.45                   +1341.65

The plan forward

Please review and let me know if you want any changes or fixes before merging. I'd be more than happy to do it.

Hope all is well!

Jeff

LeventErkok commented 3 years ago

Great!

Would be good to get to the bottom of the performance anomaly in the birthday puzzle. Just to see what's going on.

I'm not surprised the fpRem tests failed. You really need a very fresh z3 for that. (Like compiled from sources yesterday.)

Just to be clear: Do you want me to merge this branch? Or do you want to work some more on it before it gets merged? I think either is fine, let me know what your preference is.

doyougnu commented 3 years ago

Hey Levent, please go ahead and merge the branch, any further work on this PR would result in too many changes in a single patch to the code base.

LeventErkok commented 3 years ago

Sounds good. It'll probably take me a few days to get this done as I want to familiarize myself with all the changes. But overall it looks great.

LeventErkok commented 3 years ago

@doyougnu

I'm looking into this, and a little concerned about the test results. I'm guessing you did not run the doctests? I'm getting few failures there. In particular Documentation.SBV.Examples.Puzzles.Euler185 produces a different answer than what the master branch does; and so far as I know that puzzle has a unique solution.

If you get a chance do take a look and see why the current answer is different than the original, and see which one is actually correct. (I did the original many years ago, and I'm pretty sure I must've double checked the answer is correct. Maybe the solution isn't unique? I thought it was. Would be good to get to the bottom of this.)

Running doctests and making sure they pass would be nice. But I can do that as part of my review. If you can help with the Euler185 problem, that'd help.

LeventErkok commented 3 years ago

Also getting a failure on Documentation/SBV/Examples/Puzzles/U2Bridge.hs where the doctest is throwing an exception. That one looks even more mysterious.

doyougnu commented 3 years ago

@LeventErkok Ah you are completely right, I don't usually use doctests and forgot to run them on the branch. I'll look into these and report back.

LeventErkok commented 3 years ago

@doyougnu

Looking into this a bit more, I think the issue is that when the models are generated SBV relies on the fact that the values you put into that association list are in exactly the same order as the ones that the user declared. If you violate that, then some of the model extraction routines will simply fail. In particular, you have to be really careful about in which order the user specified the inputs. If you put them in an IntMap, you're losing that order again. (This is similar to mixing exists/forall order, but now appearing at the level of each class.)

Of course, SBV could've been architected so it doesn't really rely on that order; and if you use it in query mode it doesn't rely on that at all. But the "old" style of model construction is very sensitive to which order you put the resulting model into the association list.

One way to solve this would be to keep the original order separately and somehow manipulate the output to match that at the end. Another way might be to don't optimize the inputs into IntMaps at all, but keep your other optimizations in. I'm not sure where the biggest bang for the buck is.

LeventErkok commented 3 years ago

@doyougnu

I pushed all your changes + a few fixes I had to put in to get it to compile on my box into the a new branch: https://github.com/LeventErkok/sbv/tree/perfUpdate

Let's work off of that branch for the completion of this work.

doyougnu commented 3 years ago

Hey Levent,

I like this cautious approach because this PR is touching a lot of code.

think the issue is that when the models are generated SBV relies on the fact that the values you put into that association list are in exactly the same order as the ones that the user declared...If you put them in an IntMap, you're losing that order again.

I'm confused about this because I assumed that each value was associated with an unique NodeID and thus the order in the IntMap is the same order the values were created. I guess this assumption was wrong?

One way to solve this would be to keep the original order separately and somehow manipulate the output to match that at the end. Another way might be to don't optimize the inputs into IntMaps at all, but keep your other optimizations in. I'm not sure where the biggest bang for the buck is.

My take on this is that converting to an IntMap is paying down technical debt and has several advantages:

  1. Better time and memory performance
  2. It simplifies the code base because it maintains an ordering that we need to maintain
  3. I think the code base becomes easier to read because functions like reverse and the older types like IORef (([(Quantifier, NamedSymVar)], [NamedSymVar]), Set.Set String) are less readable (hence the comment that was next to it though!), i.e., they don't convey a lot of meaning.

So I think the correct action is to fix the architecture and technical debt because this is the root cause of these problems; even thought that is likely the hardest course of action. I think that because there is an invariant where order matters a list is simply the wrong data type and now that its been changed we have to fix all the code that implicitly assumes the ordering of the list. So if we tracked the original order separately we are really not getting the benefits of this change and adding more band aids over the root cause.

Could you point me to the right modules where the ordering is breaking down? I'll probably do a deep dive into Euler185 to explore it myself.

Also thanks for the help, thoughts, and work. It is a pleasure to contribute to sbv.

LeventErkok commented 3 years ago

I'm confused as to why the examples fail as well. As you correctly observed, the unique-id's should be in the order of free-variable creation; so the order should just be preserved. Maybe I misdiagnosed the issue.

I guess the right thing to do is to look at why Euler185 is currently failing. My initial hunch was that the order mattered, but IntMap should have the exact same order. So, let's get to the bottom of that.

If we do indeed find that the order is the issue, then I'm also OK with ripping out the functions that rely on that order. Now that we've the Query mode, they're less useful. But let's first really understand why Euler185 is failing. Have a look at that one, and I'll do the same over the weekend unless you beat me to it and find the culprit beforehand.

LeventErkok commented 3 years ago

@doyougnu

Hi Jeff. Looks like the issue stems from not respecting the unique-ids when you create the SMTModel value. I believe the issue is in this line:

https://github.com/LeventErkok/sbv/blob/7fb523f537bb89c12dfd03056d7ba5c7c6feef5a/Data/SBV/Control/Query.hs#L347

The type of assocs is:

https://github.com/LeventErkok/sbv/blob/7fb523f537bb89c12dfd03056d7ba5c7c6feef5a/Data/SBV/Core/Symbolic.hs#L2004

So, it's a map from String. When you render this to a list (by calling M.fromList on it), the result comes with the ascending order of the names. But we want this to be in the same order of the unique-node-id that was associated with that node.

I suppose you can fix this by keeping track of the unique-id in that map as well. That would change the underlying type of SMTModel, which would be a backwards-incompatible change, so would be nice to avoid it if you can. (But wouldn't be the end of the world either, if it's the easiest.)

doyougnu commented 3 years ago

Ah yes i think you have it.

I think the fix is to remove the String and change that map completely. It looks like getObservables type needs to change to return a map. I briefly looked at getObservables and its helper function and it seems that they throw away the NodeID and so the fix would be to preserve it in these functions.

Doing a cursory look at the code this would also mean changes to rObservables in the symbolic interpreter state from IORef [(String, CV -> Bool, SV)] to IORef (IntMap (String, CV -> Bool, SV)]) where the indices for the map would be the NodeIDs in the SV type. Its the same maneuver as the Inputs change. But by changing this type then getObservables would return an IntMap, and assocs would be an order preserving map merge which would also be a small performance win.

In general, I prefer to avoid breaking changes as much as possible, and I'm not sure if SMTModel will have to change I'll have to make alterations to getObservables first and find out. The good aspect of SMTModel's type is that if we have the right IntMap type then making the SMTModel type is pretty trivial.

I'm working on my PhD preliminary exam proposal at the moment so I'll be fairly tied up in the next few weeks but I'll try to tinker around with this tonight after I hit my progress goals for today.

LeventErkok commented 3 years ago

@doyougnu

Glad to hear! There's no hurry on this patch by any means. The perfUpdate branch can be merged whenever ready.

Best of luck with the PhD exams!