Closed doyougnu closed 3 years ago
Great!
Would be good to get to the bottom of the performance anomaly in the birthday puzzle. Just to see what's going on.
I'm not surprised the fpRem
tests failed. You really need a very fresh z3 for that. (Like compiled from sources yesterday.)
Just to be clear: Do you want me to merge this branch? Or do you want to work some more on it before it gets merged? I think either is fine, let me know what your preference is.
Hey Levent, please go ahead and merge the branch, any further work on this PR would result in too many changes in a single patch to the code base.
Sounds good. It'll probably take me a few days to get this done as I want to familiarize myself with all the changes. But overall it looks great.
@doyougnu
I'm looking into this, and a little concerned about the test results. I'm guessing you did not run the doctests? I'm getting few failures there. In particular Documentation.SBV.Examples.Puzzles.Euler185
produces a different answer than what the master branch does; and so far as I know that puzzle has a unique solution.
If you get a chance do take a look and see why the current answer is different than the original, and see which one is actually correct. (I did the original many years ago, and I'm pretty sure I must've double checked the answer is correct. Maybe the solution isn't unique? I thought it was. Would be good to get to the bottom of this.)
Running doctests and making sure they pass would be nice. But I can do that as part of my review. If you can help with the Euler185 problem, that'd help.
Also getting a failure on Documentation/SBV/Examples/Puzzles/U2Bridge.hs where the doctest is throwing an exception. That one looks even more mysterious.
@LeventErkok Ah you are completely right, I don't usually use doctests and forgot to run them on the branch. I'll look into these and report back.
@doyougnu
Looking into this a bit more, I think the issue is that when the models are generated SBV relies on the fact that the values you put into that association list are in exactly the same order as the ones that the user declared. If you violate that, then some of the model extraction routines will simply fail. In particular, you have to be really careful about in which order the user specified the inputs. If you put them in an IntMap, you're losing that order again. (This is similar to mixing exists/forall order, but now appearing at the level of each class.)
Of course, SBV could've been architected so it doesn't really rely on that order; and if you use it in query mode it doesn't rely on that at all. But the "old" style of model construction is very sensitive to which order you put the resulting model into the association list.
One way to solve this would be to keep the original order separately and somehow manipulate the output to match that at the end. Another way might be to don't optimize the inputs into IntMaps at all, but keep your other optimizations in. I'm not sure where the biggest bang for the buck is.
@doyougnu
I pushed all your changes + a few fixes I had to put in to get it to compile on my box into the a new branch: https://github.com/LeventErkok/sbv/tree/perfUpdate
Let's work off of that branch for the completion of this work.
Hey Levent,
I like this cautious approach because this PR is touching a lot of code.
think the issue is that when the models are generated SBV relies on the fact that the values you put into that association list are in exactly the same order as the ones that the user declared...If you put them in an IntMap, you're losing that order again.
I'm confused about this because I assumed that each value was associated with an unique NodeID
and thus the order in the IntMap
is the same order the values were created. I guess this assumption was wrong?
One way to solve this would be to keep the original order separately and somehow manipulate the output to match that at the end. Another way might be to don't optimize the inputs into IntMaps at all, but keep your other optimizations in. I'm not sure where the biggest bang for the buck is.
My take on this is that converting to an IntMap
is paying down technical debt and has several advantages:
reverse
and the older types like IORef (([(Quantifier, NamedSymVar)], [NamedSymVar]), Set.Set String)
are less readable (hence the comment that was next to it though!), i.e., they don't convey a lot of meaning.So I think the correct action is to fix the architecture and technical debt because this is the root cause of these problems; even thought that is likely the hardest course of action. I think that because there is an invariant where order matters a list is simply the wrong data type and now that its been changed we have to fix all the code that implicitly assumes the ordering of the list. So if we tracked the original order separately we are really not getting the benefits of this change and adding more band aids over the root cause.
Could you point me to the right modules where the ordering is breaking down? I'll probably do a deep dive into Euler185
to explore it myself.
Also thanks for the help, thoughts, and work. It is a pleasure to contribute to sbv
.
I'm confused as to why the examples fail as well. As you correctly observed, the unique-id's should be in the order of free-variable creation; so the order should just be preserved. Maybe I misdiagnosed the issue.
I guess the right thing to do is to look at why Euler185 is currently failing. My initial hunch was that the order mattered, but IntMap should have the exact same order. So, let's get to the bottom of that.
If we do indeed find that the order is the issue, then I'm also OK with ripping out the functions that rely on that order. Now that we've the Query mode, they're less useful. But let's first really understand why Euler185 is failing. Have a look at that one, and I'll do the same over the weekend unless you beat me to it and find the culprit beforehand.
@doyougnu
Hi Jeff. Looks like the issue stems from not respecting the unique-ids when you create the SMTModel
value. I believe the issue is in this line:
The type of assocs
is:
So, it's a map from String
. When you render this to a list (by calling M.fromList
on it), the result comes with the ascending order of the names. But we want this to be in the same order of the unique-node-id that was associated with that node.
I suppose you can fix this by keeping track of the unique-id in that map as well. That would change the underlying type of SMTModel
, which would be a backwards-incompatible change, so would be nice to avoid it if you can. (But wouldn't be the end of the world either, if it's the easiest.)
Ah yes i think you have it.
I think the fix is to remove the String
and change that map completely. It looks like getObservables
type needs to change to return a map. I briefly looked at getObservables
and its helper function and it seems that they throw away the NodeID
and so the fix would be to preserve it in these functions.
Doing a cursory look at the code this would also mean changes to rObservables
in the symbolic interpreter state from IORef [(String, CV -> Bool, SV)]
to IORef (IntMap (String, CV -> Bool, SV)])
where the indices for the map would be the NodeID
s in the SV
type. Its the same maneuver as the Inputs change. But by changing this type then getObservables
would return an IntMap
, and assocs
would be an order preserving map merge which would also be a small performance win.
In general, I prefer to avoid breaking changes as much as possible, and I'm not sure if SMTModel
will have to change I'll have to make alterations to getObservables
first and find out. The good aspect of SMTModel
's type is that if we have the right IntMap
type then making the SMTModel
type is pretty trivial.
I'm working on my PhD preliminary exam proposal at the moment so I'll be fairly tied up in the next few weeks but I'll try to tinker around with this tonight after I hit my progress goals for today.
@doyougnu
Glad to hear! There's no hurry on this patch by any means. The perfUpdate
branch can be merged whenever ready.
Best of luck with the PhD exams!
Hi Levent,
This PR is to fix #562. In addition to that it does the following:
Added dependencies
Changes to cabal file
SBVTest
and both benchmark targets with-threaded
Reduced laziness
BangPatterns
to any lazy accumulator I foundINLINE glue code
INLINE
annotations to functions that are heavily used throughout theCore
codebase but are likely to not be inlined by ghc. There is definitely more experimenting to be done aroundINLINE
vsINLINABLE
vsNOINLINE
pragmas, but this should wait until more of these older data structures are changed to something more efficient.Test pass rate
After merging with master I get failures from
fpRem
so it's possible I've reintroduced a bug you fixed in commit aa25b5527687e13d5f3e0123c92d099602873f09. I did make sure commit aa25b5527687e13d5f3e0123c92d099602873f09 was applied so I am not sure what is happening here. Please take a look. Here is my latest run:Benchmarks
I wasn't expecting to observe any changes in the benchmarks but was pleasantly surprised:
default(0)
is the benchmark run onmaster
default(1)
is thisPR
The rightmost column gives the
speedup
statistic (time ofmaster
-pr-branch
, ordered from faster to slower). So from these benchmarks we saw ~13% speedup for certain work loads and ~9% slowdown for some benchmarks. You'll notice that one benchmarkPuzzles//Birthday
became extremely worse (~1000% worse!, also this is a recurring problem with this benchmark, see #545). I am not sure why that is yet and I have to investigate the code path this benchmark takes. I think there are two likely possibilities: either this benchmark relied on some laziness that I accidentally removed, or the benchmark blew past the timeout limit and the benchmark itself is spurious.The plan forward
Text
instead ofString
typesPlease review and let me know if you want any changes or fixes before merging. I'd be more than happy to do it.
Hope all is well!
Jeff