Support exactly-k-out-of-N constraints

sfiruch commented 8 years ago

Klieber & Kwon show a nice & simple formulation in "Efficient CNF Encoding for Selecting 1 from N Objects", 2007 (https://www.cs.cmu.edu/~wklieber/papers/2007_efficient-cnf-encoding-for-selecting-1.pdf) of k-of-N constraints.

At the moment I work by combining AND & AtMost constraints, which lead to "meh" performance. Perhaps a better formulation would help, and a nice interface for the above-mentioned algorithm would make it so much easier for users.

NikolajBjorner commented 8 years ago

I appreciate the reference. Is the application using the SAT backend or the SMT engine? They get compiled differently.

Kleiber and Kwon consider at-most-1 constraints using commander variables. It does not address at-most-k constraints, where I currently use bit-vector encodings for bit-blasting. This requires O(n log n) bits. The encodings we use for a single bit is specialized to use a carry-bit, which leads to also an O(n) size, but different propagators. It is very possible that Klieber&Kwon provides better propagation. The SMT based backend uses Pseudo-Boolean inequalities that get compiled on demand to clauses.

sfiruch commented 8 years ago

I'd like to use the encoding for the SMT engine, not just the SAT backend. I often use Z3 for various optimization problems, because it is much more flexible than a pure SAT solver. However, encoding 1-of-k constraints (which occur often in general optimization problems) are probably not encoded optimally.

NikolajBjorner commented 8 years ago

let me know if you observe any differences. Currently the SAT backend also leverages cutting planes better to pre-process soft constraints, the SMT solver currently loses some pre-processing opportunities if you use Pseudo-Boolean inequalities.

sfiruch commented 8 years ago

Here are several anectotal observations, when working with small sets of variables (5-30). I tested each approach with three instances of a problem, reported times are instance averages, with a time limit of 1500s. I used the default tactic in each case.

Without commander variables

1320s: Encoding with Z3's AtMostK/OR combination
239s: Using a encoding in this style (A & !B & !C & !D) | (!A & B & !C & !D) | (!A & !B & C & !D) | (!A & !B & !C & D)
N/A: Using pairwise encodings à la (!A | !B) & (!A | !C) & (!A | !D) & (!B | !C) & (!B | !D) & (!C | !D) couldn't solve any instances.

Commander groups of size 3

405s: Using commander variables with Z3's AtMostK/OR constraints for groups
170s: Using commander variables, using the second encoding for groups
N/A: Using commander variables with a pairwise encoding couldn't solve any instances.

Commander groups of size 4

67s: Using commander variables, using the second encoding for groups

Klieber & Kwon describe the choice of group size 3 as optimal, but in my tests I found groups of 4 elements to be optimal.

Here is the code I used for my experiments:

//implementation of https://www.cs.cmu.edu/~wklieber/papers/2007_efficient-cnf-encoding-for-selecting-1.pdf
int cmderIdx=0;
BoolExpr ProcessGroup(Solver _s, Context _ctx,BoolExpr[] _expr)
{
    var commander = _ctx.MkBoolConst($"c[{cmderIdx++}]");
    _s.Assert(_ctx.MkOr(_expr.Select(sel => _ctx.MkAnd(_expr.Where(e => e!=sel).Select(e => _ctx.MkNot(e)))))); //1
    _s.Assert(_ctx.MkOr(_ctx.MkNot(commander),_ctx.MkOr(_expr))); //2
    _s.Assert(_ctx.MkOr(commander, _ctx.MkNot(_ctx.MkOr(_expr)))); //3
    return commander;
}

void ExactlyOneOf(Solver _s, Context _ctx, BoolExpr[] _expr)
{
    switch (_expr.Length)
    {
        case 1:
            _s.Assert(_expr[0]);
            break;
        case 2:
            _s.Assert(_ctx.MkXor(_expr[0],_expr[1]));
            break;
        case 3:
        case 4:
            _s.Assert(_ctx.MkOr(_expr.Select(sel => _ctx.MkAnd(_expr.Select(e => e == sel ? e : _ctx.MkNot(e))))));
            return;
        default:
            var groups = new BoolExpr[(_expr.Length + 3) / 4][];
            for (var i = 0; i < _expr.Length; i += 4)
                groups[i / 4] = _expr.Skip(i).Take(4).ToArray();

            ExactlyOneOf(_s, _ctx, groups.Select(g => ProcessGroup(_s, _ctx, g)).ToArray());
        break;
    }
}

jwaldmann commented 8 years ago

For reference, here is a comparison of several at-most-one encodings (Hölldobler and Nguyen, 2013) http://www.wv.inf.tu-dresden.de/Publications/2013/report-13-04.pdf

NikolajBjorner commented 8 years ago

I found the following paper by Abio et.al. in CP13 rather nice: http://link.springer.com/chapter/10.1007%2F978-3-642-40627-0_9. But it doesn't produce the best encoding for at-most-1 as pointed out in this thread. Also, in principle sorting networks are producing more information than necessary, so it makes one wonder if anything was done for circuit encodings of linear time k-median algorithms.

NikolajBjorner commented 8 years ago

I have no idea how to reproduce the timings you report because I don't have the benchmarks you refer to. I have made an overhaul of the two back-ends that process cardinality constraints. There is a dedicate solver that is retrieved using "QF_FD" now as well. To exercise at-most-1 constraints I added a python example: ec5d4f111970ca74bacba12243f681e4a52660d5. I did not see an overly profound performance difference for the different variants (sorting circuits, bimander and commander).

sfiruch commented 8 years ago

Here's a toy example: MagicSquare.cs

You can play around by using other encodings in the ExactlyOneOf method.
Larger/smaller instances can be created by changing N.
The example contains DIMACS output in the Output method, for tests with SAT solvers.
This is just an example, not the best-known encoding of the problem

NikolajBjorner commented 8 years ago

Very nice. I learned a few things from this example. Most noteworthy is that when you have constraints of the form x1 + ... + xN = 1, then it is very useful to compile the entire expression as a unit and not treat it as a conjunction x1 + ... + xN <= 1 and x1 + ... + xN >= 1 (the latter is a disjunction). The auxiliary disjunctions created when compiling x1 + ... + xN <= 1 can be recycled for the lower bound and furthermore enable better propagation among the partitions. I am therefore exposing PbEq over the API (Z3_mk_pbeq) and treating it as a special case. The timing is getting closer to the custom expansion, but the translation is still slightly wasteful in that it includes some extra variables that later are asserted as units.

NikolajBjorner commented 6 years ago

There are now several implementations of these encodings as well as native PB and cardinality solvers in the SAT solver.

Z3Prover / z3