Open olivierdeckers opened 4 years ago
Yikes! My sample of 100 runs got 67 failures, which is definitely not good...
So, every time a failure is encountered, the library "blames" a subset of the configuration, and knows that any future configuration involving that subset can be discarded. For example, if a failure occurs because variable A = 5
and variable B = 4
, and a constraint dictates A = B
, all future configurations involving A = 5, B = 4
are discarded.
Because we can't "inspect" functions*, we can't see the constraints in play, so we can't confidently discard every configuration in which A = B
. It means that we might end up doing a fair amount of unnecessary work, but it should guarantee that we never get a false negative.
Clearly, in this situation, our conservative strategy of eliminating candidate configurations hasn't been conservative enough! My guess is that the subset of configuration being "blamed" for a failure is too small - some variable is having an effect on a computation, but that isn't being recorded.
In theory, we can write a pretty concise property test: if I take any random problem, and solve it with and without shuffled input, the set of solutions should always be equivalent. However, I suspect this will turn out to be a bit of a Heisenbug...
I'll look into it this week, and hopefully get to the bottom of it. It definitely doesn't seem to be anything suspicious in your code (typically, a call to unsafeRead
is to blame), and I can hopefully make a much smaller test case. Thanks for reporting this!
One question occurs, though the answer should definitely be "yes": when it does succeed, do the solutions satisfy your constraints?
Sorry about this - I'll get back to you ASAP 😅
* While true, there are sneakier ways round this that I'm currently researching (https://github.com/conal/concat/)
Thanks for looking into it. There is absolutely no need to be sorry by the way, I'm having fun trying to understand how this all works, thanks for enabling that!
So far, the solutions that were found are indeed correct.
AFAICS, the problem is that the shuffling (performed by refine
) of the minor inputs is not done only once for each major input, but possibly multiple times for a same major input when a backtracking happens, but then when the shuffle happens to be different than the previous time, any combination recorded in the nogoods
set involving the minor inputs of that major input are no longer valid.
https://github.com/i-am-tom/holmes/blob/16e9acbdc9ac49974085f8dc639638420a54a5ac/src/Control/Monad/MoriarT.hs#L203-L210
Just popping in to say that this library is undergoing a major (albeit very slow) rewrite - I haven't forgotten this 😅
@ju1m What you say makes sense, though - I'll try to take a look this week (emigrated to Spain last Wednesday, so life is a little hectic!)
@i-am-tom, you should be able to observe that behavior as I did with a few traceM
.
Indeed, I've seen and read with enthusiasm https://github.com/i-am-tom/puzzled though I personnally would not have chosen to go the concat/categorifier way, but it's interesting that you try it. Enjoy the spanish country!
I've put my debugging code here: https://github.com/ju1m/holmes/commit/debfeac29f341d9c5c3ee5cf45b9975611d1c8a5
$ cabal run examples >shuffle-noguard.log
$ grep 'index=.*(20,' shuffle-noguard.log
index=[([(20,True)],[Water]),([(20,False)],[Air])]
index=[([(20,True)],[Water]),([(20,False)],[Air])]
index=[([(20,True)],[Water]),([(20,False)],[Air])]
index=[([(20,True)],[Water]),([(20,False)],[Air])]
index=[([(20,True)],[Air]),([(20,False)],[Water])]
See https://github.com/olivierdeckers/holmes/blob/aquarium-example/examples/Aquarium.hs for an example
when running
solve example4
, it sometimes returns the solution, and sometimes returnsNothing
.