msoos / cryptominisat

An advanced SAT solver
https://www.msoos.org
Other
803 stars 179 forks source link

Small test for irred duplicate removal: but get an assert in stats - hint to solve this? #133

Closed capiman closed 10 years ago

capiman commented 10 years ago

I have added a small check to detect irred duplicates and discard them. Unfortunately i get an assert with stats:

ERROR: ->numLitsIrred: 1278 ->litStats.irredLits: 1152 Assertion failed!

Program: D:\cryptominisat-26112013\cmsat\cryptominisat64.exe File: solver.cpp, Line 3392

Expression: numLitsIrred == litStats.irredLits

This application has requested the Runtime to terminate it in an unusual way. Please contact the application's support team for more information.

This points to assert(numLitsIrred == litStats.irredLits); in Solver::checkStats

This is what i have changed: (still a lot of debug output inside, but commented out) (it is based on current version of sources)

void Solver::attachClause(
    const Clause& cl
    , const bool checkAttach
) {
    #if defined(DRUP_DEBUG) && defined(DRUP)
    if (drup) {
        for(size_t i = 0; i < cl.size(); i++) {
            *drup << cl[i];
        }
        *drup << fin;
    }
    #endif

#if 1 // MM
    if (!cl.red()) {
        size_t clauses_to_check = 0;
        size_t clauses_to_check_right_size = 0;
        size_t clauses_to_check_wrong_size = 0;

        // cout << "c Here is my check for duplicate clauses..." << endl;

        for(vector<ClOffset>::const_iterator
            it = longIrredCls.begin(), end = longIrredCls.end()
            ; it != end
            ; it++
        ) {
            clauses_to_check++;
            Clause* cl2 = clAllocator->getPointer(*it);
            if (cl.size() == cl2->size()) {
                size_t i;

                clauses_to_check_right_size++;

                for(i = 0; i < cl.size(); i++) {
                    size_t j;

                    // cout << "i=" << i << " cl.size()=" << cl.size() << endl;

                    for(j = 0; j < cl2->size(); j++) {

                        // cout << "j=" << j << " cl2->size()=" << cl2->size() << endl;

                        // cout << "c Check " << cl[i] << " against " << (*cl2)[j] << endl;
                        if(cl[i] == (*cl2)[j]) {
                            // cout << "c -> equal" << endl;
                            break;
                        }
                        else {
                            // cout << "c -> NOT equal" << endl;
                        }
                    }
                    if(j == cl2->size())
                        break;
                }
                if(i == cl.size()) {
                    cout << "c Duplicate1: " << cl << endl;
                    cout << "c Duplicate2: " << (*cl2) << endl;
                    cout << "c Duplicate clause!!! -> Discard" << endl;
                    return;
                }
            }
            else {
                clauses_to_check_wrong_size++;
            }
        }

        // cout << "c Clauses checked:            " << clauses_to_check            << endl;
        // cout << "c Clauses checked right size: " << clauses_to_check_right_size << endl;
        // cout << "c Clauses checked wrong size: " << clauses_to_check_wrong_size << endl;
    }
#endif // MM

    //Update stats
    if (cl.red())
        litStats.redLits += cl.size();
    else
        litStats.irredLits += cl.size();

    //Call Solver's function for heavy-lifting
    PropEngine::attachClause(cl, checkAttach);
}

Do i need to add my "check and discard" directly into Solver::addClauseInt or perhaps even higher to solve the problem with stats?

msoos commented 10 years ago

Actually, this should work! (good job!) I'm a bit surprised... I'll look through this later. The problem is that the litStats.irredLits don't seem to match up with the number of literals of all the clauses in the database. Manipulating the attachClause function is a bit hairy, though, as the clause might already be .push()-ed to the irredLongClauses and returning from Solver::attachClause might not be enough. Check where Solver::attachClause is being called. I think it might be called from all over the place, like ClauseVivifier and Simplifier which might assume certain things.

capiman commented 10 years ago

(First time i am using gdb to list where function is called from, under Windows, 64 bit exe)

The clause seems to be deleted from "Searcher::add_otf_subsume_long_clauses".

0 KillMe () at solver.cpp:66

1 0x00000000004446df in CMSat::Searcher::add_otf_subsume_long_clauses (this=this@entry=0x2f16fa0) at searcher.cpp:1264

2 0x000000000044abb8 in CMSat::Searcher::handle_conflict (this=this@entry=0x2f16fa0, confl=..., confl@entry=...)

at searcher.cpp:1559

3 0x000000000044b0da in CMSat::Searcher::search (this=this@entry=0x2f16fa0) at searcher.cpp:1029

4 0x000000000044b5f6 in fu816ZSt4cout () at searcher.cpp:2235

5 0x000000000041aebd in fu257ZSt4cout () at solver.cpp:1658

6 0x0000000000402d0e in fu1140ZTVSt9basic_iosIcSt11char_traitsIcEE () at main.cpp:946

7 0x00000000004dded3 in main (argc=2, argv=0x22500) at main.cpp:1036

The assert in CheckStats() seems to be called from

2 0x0000000000412c01 in CMSat::Solver::extend_solution (this=this@entry=0x2f16fa0) at solver.cpp:1571

capiman commented 10 years ago

Unfortunately the check is far too slow for general use, but it works for smaller CNFs to see which part of the code the duplicate clauses generates. Here is an example where I used the CNF which you used recently on your homepage: (in above source code i had replaced the "return" by an empty function KillMe() { }, just to be able to trigger in gdb when duplicate is detected and not discarding the duplicate clauses)

D:\cryptominisat-26112013\cmsat>gdb --args cryptominisat64test1 d:\UTI-20-10p0.cnf
GNU gdb (rubenvb-4.7.4-release) 7.6
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-w64-mingw32".
For bug reporting instructions, please see:
<mingw-w64-public@lists.sourceforge.net>...
Reading symbols from D:\cryptominisat-26112013\cmsat\cryptominisat64test1.exe...done.
(gdb) start
Temporary breakpoint 1 at 0x4ddf20: file main.cpp, line 1029.
Starting program: D:\cryptominisat-26112013\cmsat\cryptominisat64test1.exe d:\UTI-20-10p0.cnf
[New Thread 6724.0x197c]

Temporary breakpoint 1, main (argc=2, argv=0x22520) at main.cpp:1029
1029    {
(gdb) break KillMe
Breakpoint 2 at 0x409470: file solver.cpp, line 66.
(gdb) continue
Continuing.
c Outputting solution to console
c CryptoMiniSat version MinGW64-compiled, without GIT
c compiled with gcc version 4.7.3
c Executed with command line:D:\cryptominisat-26112013\cmsat\cryptominisat64test1.exe d:\UTI-20-10p0.cnf
c Reading file 'd:\UTI-20-10p0.cnf'
c -- header says num vars:         259616
c -- header says num clauses:     1374599
c -- clauses added:            0 redundant      1374599 irredundant
c -- vars added     259616
c Parsing time: 30.98 s
c clause size stats. size4: 7260 size5: 29040 larger: 29060
c [clean] T: 0.0930 s
c Doing burst search for 300 conflicts
c Duplicate1: 138778 138657 -139257 138940 138796
c Duplicate2: 138778 138657 -139257 138940 138796
c Duplicate clause!!! -> Discard

Breakpoint 2, KillMe () at solver.cpp:66
66      {
(gdb) bt
#0  KillMe () at solver.cpp:66
#1  0x000000000041c05a in __fu268__ZSt4cout () at solver.cpp:447
#2  0x000000000044474f in CMSat::Searcher::add_otf_subsume_long_clauses (this=this@entry=0x2fa6fa0) at searcher.cpp:1264
#3  0x000000000044ac28 in CMSat::Searcher::handle_conflict (this=this@entry=0x2fa6fa0, confl=..., confl@entry=...)
    at searcher.cpp:1559
#4  0x000000000044b14a in CMSat::Searcher::search (this=this@entry=0x2fa6fa0) at searcher.cpp:1029
#5  0x000000000044b33d in __fu812__ZSt4cout () at searcher.cpp:1673
#6  0x000000000044b49c in CMSat::Searcher::solve (this=this@entry=0x2fa6fa0, _maxConfls=31000) at searcher.cpp:2210
#7  0x000000000041aead in __fu257__ZSt4cout () at solver.cpp:1683
#8  0x0000000000402d0e in __fu1140__ZTVSt9basic_iosIcSt11char_traitsIcEE () at main.cpp:946
#9  0x00000000004ddfd3 in main (argc=2, argv=0x22520) at main.cpp:1036

I have done the same example with "--otfsubsume=0" to avoid the position, where the duplicate was generated. But then the duplicates are generated at a different position:

D:\cryptominisat-26112013\cmsat>gdb --args cryptominisat64test1 --otfsubsume=0 d:\UTI-20-10p0.cnf
GNU gdb (rubenvb-4.7.4-release) 7.6
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-w64-mingw32".
For bug reporting instructions, please see:
<mingw-w64-public@lists.sourceforge.net>...
Reading symbols from D:\cryptominisat-26112013\cmsat\cryptominisat64test1.exe...done.
(gdb) start
Temporary breakpoint 1 at 0x4ddf20: file main.cpp, line 1029.
Starting program: D:\cryptominisat-26112013\cmsat\cryptominisat64test1.exe "--otfsubsume=0" d:\UTI-20-10p0.cnf
[New Thread 2876.0x1cc]

Temporary breakpoint 1, main (argc=3, argv=0x19325a0) at main.cpp:1029
1029    {
(gdb) break KillMe
Breakpoint 2 at 0x409470: file solver.cpp, line 66.
(gdb) continue
Continuing.
c Outputting solution to console
c CryptoMiniSat version MinGW64-compiled, without GIT
c compiled with gcc version 4.7.3
c Executed with command line:D:\cryptominisat-26112013\cmsat\cryptominisat64test1.exe --otfsubsume=0 d:\UTI-20-10p0.cnf
c Reading file 'd:\UTI-20-10p0.cnf'
c -- header says num vars:         259616
c -- header says num clauses:     1374599
c -- clauses added:            0 redundant      1374599 irredundant
c -- vars added     259616
c Parsing time: 30.70 s
c clause size stats. size4: 7260 size5: 29040 larger: 29060
c [clean] T: 0.0930 s
c Doing burst search for 300 conflicts
c 300-long burst search  learnt units:109 learnt bins: 45
c Chose restart type glue-based
c Using var act-multip 11 instead of standard 11 and act-divider 10 instead of standard 10
c     1   315  181707   51K  281K  724K   6.1   146    15 10487 100.3
c newZeroDepthAss : 21332
c [clean] T: 0.0780 s
c new bins since last SCC: 1024919 free vars %:564.05
c [clean] T: 0.0780 s
c SCC new: 6798 T: 0.19 s
c [clean] T: 0.0630 s
c Duplicate1: 6975 6982 7315 7316 7319
c Duplicate2: 6975 6982 7315 7316 7319
c Duplicate clause!!! -> Discard

Breakpoint 2, KillMe () at solver.cpp:66
66      {
(gdb) bt
#0  KillMe () at solver.cpp:66
#1  0x000000000041c05a in __fu268__ZSt4cout () at solver.cpp:447
#2  0x00000000004220af in CMSat::VarReplacer::handleUpdatedClause (this=this@entry=0x2edec70, c=..., origLit1=...,
    origLit2=...) at varreplacer.cpp:623
#3  0x0000000000422596 in CMSat::VarReplacer::replace_set (this=this@entry=0x2edec70, cs=...) at varreplacer.cpp:548
#4  0x0000000000422a92 in __fu391__ZSt4cout () at varreplacer.cpp:198
#5  0x00000000004411f8 in __fu740__ZSt4cout () at searcher.cpp:2125
#6  0x000000000044b5cb in __fu816__ZSt4cout () at searcher.cpp:2242
#7  0x000000000041aead in __fu257__ZSt4cout () at solver.cpp:1683
#8  0x0000000000402d0e in __fu1140__ZTVSt9basic_iosIcSt11char_traitsIcEE () at main.cpp:946
#9  0x00000000004ddfd3 in main (argc=3, argv=0x19325a0) at main.cpp:1036
msoos commented 10 years ago

This is not the proper way to discard duplicates. It's cheaper to add them and then the included simplification passes remove them in one go, much faster, and in a more efficient manner. I really think it's better to just run the solver and wait for the simplification pass to start and deal with duplicate clauses. The above solution will run worst-case O(n^2) while most of the passes will run much-much faster, often something like O(n_log(n/k)_k) where k is the number of variables and n is the number of clauses. That's significantly faster than the above algorithm. Closing.