Incorrect weighted model count because of learn-and-start-over?

latower commented 2 years ago

Dear ganak developers and maintainers,

I'm running some experiments on the wmc branch and there is one instance for which the weighted model count returned by ganak is twice that of the weighted model count returned by Cachet, ace (it's an encoding of a Bayesian Network), and my own version of weighted sharpSAT. To be specific: ganak returns 1.20932839536671822778, while the other solvers return 0.604664197683359113891.

I have always had the -noPMC and -noPCC flags, but I have played with different branching heuristics, with always the same result.

However: when I turn LSO off, the weighted model count returned by ganak matches the model count returned by the other solvers.

I'm guessing I can for now just work around this problem by disabling LSO for my experiments, but it would be nice to fix this, if it is indeed buggy behaviour, or otherwise just understand it better. Could you maybe help with this? Am I missing or misunderstanding something?

A few more thoughts on this problem:

I noticed that any variables for which the weight is unspecified, are interpreted by ganak as having weight 1 for each literal. I also noticed that free variables still contribute to the model count, such that for each free variable for which both literals have weight 1, the model count is multiplied with 2.

As far as I can tell, there are 4 free variables in the instance below:

Variable 259 with weight 0.12 is free
Variable 261 with weight 0.45 is free
Variable 263 with weight 0.28 is free
Variable 265 with weight 0.63 is free

Since none of them has weight 1, I'm guessing that maybe they are not the cause of the factor 2 discrepancy with the model count returned by the other solvers, but it must be one of the variables with no specified weight? I haven't seen this discrepancy in any of the other 50 instances that I generated.

Thanks for reading, and kind regards, anna

The problematic instance: input2.cnf.gz

msoos commented 2 years ago

Hi! Sorry for being so late :( Thanks for the super-detailed information! I will take a look in the next days and see what the problem may be. Your analysis is super-helpful, actually, thank you so much!

latower commented 2 years ago

Hi Mate,

Thank you for your response! I was wondering if there may be an update on this issue?

Kind regards, Anna

msoos commented 2 years ago

Hey @smsharma1 do you have an idea why this may be happening?

msoos commented 2 years ago

Now I feel like I should have been much more alert to this bug, thanks @latower There is also a bug in the master branch (should be renamed to main), counting non-weighted files. The attached file is miscounted. See: track3_012.cnf.zip

soos@tiresias:build$ /home/soos/development/sat_solvers/ganak/build/ganak -cs 4000 track3_012.cnf 
c Outputting solution to console
c GANAK version 1.0.0
c ganak GIT revision: 67d04a9bb499d0a70e0eba3b5d0313fbb6dba65e
c ganak build env: CMAKE_CXX_COMPILER = /usr/lib/ccache/bin/c++ | CMAKE_CXX_FLAGS =  -mtune=native -Wall -Wextra -Wunused -Wsign-compare -fno-omit-frame-pointer -Wtype-limits -Wuninitialized -Wno-deprecated -Wstrict-aliasing -Wpointer-arith -Wpointer-arith -Wformat-nonliteral -Winit-self -Wparentheses -Wunreachable-code -ggdb3 | COMPILE_DEFINES =  | STATICCOMPILE = OFF | ONLY_SIMPLE =  | Boost_FOUND =  | ZLIB_FOUND =  | ENABLE_TESTING =  | ENABLE_ASSERTIONS = ON | MY_TARGETS =  | compilation date time = Jul 15 2022 16:57:55
c Solving track3_012.cnf
c variables (all/used/free):    6271/6271/0
c clauses (all/long/binary/unit): 18729/7536/11184/9
c Sampling set is present, performing projected model counting 
c Sampling set size: 40
c Sampling set:  18 310 365 794 850 1087 1133 1378 1579 1865 2107 2225 3036 3082 3157 3404 3441 3521 3558 3621 4050 4066 4325 4362 4378 4388 4425 4562 4691 4747 4782 5031 5214 5334 5429 5604 5754 5896 5975 6260
c Preprocessing ..
c DONE
c variables (all/used/free):    4814/4783/31
c clauses (all/long/binary/unit): 13978/4706/9272/0
c Maximum cache size:   4000 MB
[...] 
c # solutions 
s pmc 18766
c # END
c time: 63.7687s

The output from CryptoMiniSat:

soos@tiresias:build$ ../../cryptominisat/build/cryptominisat5 --maxsol 200000 --verb 0 --onlysampling simplified-track3 | grep "^s SATIS" | wc -l
18718

Which is the correct count.

I just spent an entire day on this, and we may get our results thrown out of the competition due to this bug :S Even though we are winning by a wide margin (best next solver is 75 vs our 85, basically, we are crushing it). So this is frustrating. I am going to debug this, but we should try to do better next time.

In essence, we need a fuzzer to check for such cases. It's the only way to keep our system reliable in the long run. I have been lazy, and I haven't done it, but it's time to do it. It'd help with GANAK, ApproxMC, and any future counting systems we build. So I'll take the blame for this (twice, for not properly listening to @latower and not building a fuzzer).

Thanks again @latower and let's hope I can fix both now :)

msoos commented 7 months ago

Thanks for this. It made me rewrite the whole of GANAK, essentially. Current version has no bug in unweighted counting, and the weighted system is currently unsupported, unless you really need it. In that case, I'll add it to the upcoming version. Let me know how urgent it is.

Thanks again for this bugreport. Made me realize what a mess GANAK is, and eventually made me rewrite the whole thing :)

Mate

meelgroup / ganak

Incorrect weighted model count because of learn-and-start-over? #16