meelgroup / approxmc

Approximate Model Counter
Other
70 stars 26 forks source link

fix corner case of binary search #41

Closed AL-JiongYang closed 3 months ago

AL-JiongYang commented 11 months ago

The PR fixes the error discussed here: #40, where ApproxMC gets stuck in infinite loops.

The original issue triggered the error when the generated XORs were unfortunately always empty, and therefore, we can always find threshold solutions and increase the number of XORs up to n, aka the number of variables. However, the ApproxMC unexpectedly gets stuck in infinite loops at n-1 XORs.

The error is due to a common corner case in binary search, as shown at this line: https://github.com/meelgroup/approxmc/blob/8ab5426e690aeb64469904c6a425a179a8da78c9/src/counter.cpp#L630 When we approach the last step of binary search: lowerFib + 1 = upperFib, the integer division always returns lowerFib and discards the fraction (i.e., 0.5). As a result, lowerFib is always assigned to hashCount, and hashCount can never visit upperFib. That's why ApproxMC gets stuck at n-1 XORs when lowerFib = n-1 and upperFib = n.

However, the corner case doesn't lead to an error in other cases of lowerFib + 1 = upperFib. This is because the new value of upperFib is always guaranteed to have been visited before the assignment. See the visit: https://github.com/meelgroup/approxmc/blob/8ab5426e690aeb64469904c6a425a179a8da78c9/src/counter.cpp#L583, and the following upperFib updates: https://github.com/meelgroup/approxmc/blob/8ab5426e690aeb64469904c6a425a179a8da78c9/src/counter.cpp#L589 https://github.com/meelgroup/approxmc/blob/8ab5426e690aeb64469904c6a425a179a8da78c9/src/counter.cpp#L595. The property always holds except for the initial value of upperFib at this line: https://github.com/meelgroup/approxmc/blob/8ab5426e690aeb64469904c6a425a179a8da78c9/src/counter.cpp#L528, where the total_max_xors is equal to n. That's why we only encountered this error when upperFib = n.

Therefore, the fix is to add the above property back to the initial value of upperFib. We assume that when the number of XORs is n, we can only find one solution because there are at most 2^n solutions for n variables, as shown in the submitted commit.

msoos commented 11 months ago

Please get the fuzzer from Anna and run it against this ApproxMC for at least 10k rounds.

Please also run 10k fuzz runs also under Valgrind.

I have a feeling this fix is wrong in more ways than one.

AL-JiongYang commented 11 months ago

I'll do that, although I think the bug (and the fix) is very unlikely to trigger in practice. For example, given 7 variables, the probability of obtaining 7 empty XORs is 1 / (2^8)^7 = 1/2^56.

msoos commented 11 months ago

It's not about the edge case you & Yong Kiam thought about. It's about the edge cases you didn't think about that are affected by this change.

That algorithm is quite obfuscated and very complicated to get right. 2 people have been trying to get it right for 4 years and there are still bugs. So it may be hard to think through its edge cases.

On Mon, Sep 25, 2023, 14:16 Jiong Yang @.***> wrote:

I'll do that, although I think the bug (and the fix) is very unlikely to trigger in practice. For example, given 7 variables, the probability of obtaining 7 empty XORs is 1 / (2^8)^7 = 1/2^56.

— Reply to this email directly, view it on GitHub https://github.com/meelgroup/approxmc/pull/41#issuecomment-1733585463, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAKF4OP4RWPWAZATCGGSWI3X4FYZBANCNFSM6AAAAAA5FX3HEM . You are receiving this because you commented.Message ID: @.***>

kuldeepmeel commented 3 months ago

Now that we have the algorithm certified; I think this should be good to be accepted.

msoos commented 3 months ago

Merged, thanks! I'll run the fuzzer to check :)

msoos commented 3 months ago

The fuzzer found that this lead to an UNSAT->SAT bug when AppMC was called with an unsat problem. Anyway, fixed!