cjdrake / pyeda

Python EDA
BSD 2-Clause "Simplified" License
304 stars 54 forks source link

added linear-growing cardinality constraints: AtLeastK, AtMostK, ExactlyK #128

Closed sschnug closed 7 years ago

sschnug commented 9 years ago

This pull request is a prototype-implementation of those cardinality-constraints mentioned in issue #127.

State

It won't work in this state! It fails when using ExactlyK with k=1 as OneHot alternative in the sudoku example. The reason is unclear to me, but i think the newly created variables are not globally introduced. The following code demonstrates the problem (and was only observed after putting this code into pyeda; see next paragraph):

from pyeda.boolalg.expr import And, OneHot, ExactlyN, AtMostN, AtLeastN, expr2dimacscnf
from pyeda.boolalg.bfarray import exprvars

x = exprvars('x', 2)
r_0 = OneHot(*[x[i] for i in range(2)])
r_1 = ExactlyN(*[x[i] for i in range(2)]) # UNSAT which is wrong; uses AtMostN&AtLeastN internally!
r_2 = AtMostN(*[x[i] for i in range(2)])
r_3 = AtLeastN(*[x[i] for i in range(2)])
print(r_0)
print(r_1)
print(r_2)
print(r_3)
print(r_0.satisfy_one())
print(r_1.satisfy_one())
print(r_2.satisfy_one())
print(r_3.satisfy_one())

Interesting observation / Working case

While implementing and before adding to the pyeda library, i tested with the following working code (can be tested without the commit; code looks different because of some cleaning-steps, but it is the same; works on lists of variables), which let's me think, that the logic of these algorithms are correct:

from pyeda.inter import *
from pyeda.boolalg import exprnode
from pyeda.boolalg.expr import _expr
import itertools
import random
random.seed(1)

def AtMostN(*xs, k=1, auxvarname='less_or_equal'):
    """
    Return an expression that means
    "at most k input functions are true".
    Returns cnf

    k >= 1
    len(xs) > k
    """
    xs = [Expression.box(x).node for x in xs]
    n = len(xs)
    terms = list()
    aux_vars = list(map(lambda x: x.node, exprvars(auxvarname, (n-1)*k)))

    def s(i,j,k):
        return aux_vars[i*k+j]

    # case: i=1
    terms.append(exprnode.or_(exprnode.not_(xs[0]), s(0,0,k)))
    for i in range(1,k):
        terms.append(exprnode.not_(s(0,i,k)))

    # general case
    for i in range(1,n-1):
        terms.append(exprnode.or_(exprnode.not_(xs[i]),s(i,0,k)))
        terms.append(exprnode.or_(exprnode.not_(s(i-1,0,k)), s(i,0,k)))
        for u in range(1,k):
            terms.append(exprnode.or_(exprnode.not_(xs[i]), exprnode.not_(s(i-1,u-1,k)), s(i,u,k)))
            terms.append(exprnode.or_(exprnode.not_(s(i-1,u,k)), s(i,u,k)))
        terms.append(exprnode.or_(exprnode.not_(xs[i]), exprnode.not_(s(i-1,k-1,k))))

    terms.append(exprnode.or_(exprnode.not_(xs[-1]), exprnode.not_(s(n-2,k-1,k))))

    y = exprnode.and_(*terms)
    return _expr(y)

# def AtLeastN_naive(*xs, k=1):
#     xs = [Expression.box(x).node for x in xs]
#     n = len(xs)
#     return _expr(exprnode.or_(*[exprnode.and_(*i) for i in itertools.combinations(xs,k)]))

def AtLeastN(*xs, k=1, auxvarname='greater_or_equal'):
    """
    Return an expression that means
    "at least k input functions are true".
    Returns cnf
    """
    xs = [_expr(exprnode.not_(Expression.box(x).node)) for x in xs]   # negated
    n = len(xs)
    return AtMostN(*xs, k=n-k, auxvarname=auxvarname)

def ExactlyN(*xs, k=1, auxvarname='equal'):
    """
    """
    xs = [_expr(Expression.box(x).node) for x in xs]
    return And(AtMostN(*xs, k=k), AtLeastN(*xs,k=k))

def test_atmostn(runs=10000):
    for r in range(runs):
        print(r)
        n_vars = random.randint(2,100)
        value = random.randint(1,n_vars-1)

        print(('Instance', n_vars, value))

        x = exprvars('x', n_vars)

        MAX = AtMostN(*[x[i] for i in range(n_vars)], k=value, auxvarname='less_or_equal')
        MIN = AtLeastN(*[x[i] for i in range(n_vars)], k=value, auxvarname='greater_or_equal')

        BOTH = And(MAX,MIN)
        BOTH_NEW = ExactlyN(*[x[i] for i in range(n_vars)], k=value, auxvarname='equal')
        result = MIN.satisfy_one()
        assert result
        result = MAX.satisfy_one()
        assert result
        result = BOTH.satisfy_one()
        assert result
        result = BOTH_NEW.satisfy_one()
        assert result

        count_trues = 0
        for i in x:
            if result[i] == 1:
                count_trues += 1
        print(n_vars, value, count_trues)
        assert value == count_trues

test_atmostn()

Other remarks

To think about

Maybe someone may take a look at my code and debug the problems observed.

cjdrake commented 9 years ago

I added your branch to cjdrake/pyeda. Can't write to your branch, so I'm making some edits to mine.

cjdrake commented 9 years ago

Made some superficial changes just now.

Will you please write a couple unit tests for these functions? Just look in the pyeda/boolalg/test/test_expr.py module for examples.

Reading a few tests will help me understand the functions a bit better.

sschnug commented 9 years ago

Good news: i found the reason for the problem described above. (not yet fixed in code!)

Like assumed, the introduction of the auxiliary was the reason! Because i wanted my function-arguments to mimick the ones of OneHot and co. i introduced the auxvarname-argument. After debugging the newly introduced unit-tests & rereading part of the docs, it seems that a variable-object is completely defined by it's name at the time of creation! This is the source of the problems described above!

Easy case:

Not that easy case:

""" The following is the pyeda sudoku example using ExactlyN() instead of OneHot()
    This is only a proof-of-concept: it will be much slower than using naive OneHot.
    Reasons:
    - Domain-range small enough to use the naive approach (exponential-growth)
    - Instead of using Or(*xs) to formulate >= 1, this code uses <= k-1 of the negated values (AtLeastN())
"""

from pyeda.boolalg.expr import And, OneHot, NHot, ExactlyN, AtMostN, AtLeastN, expr2dimacscnf
from pyeda.boolalg.bfarray import exprvars
import itertools

gen = itertools.count()

DIGITS = "123456789"

X = exprvars('x', (1, 10), (1, 10), (1, 10))

V = And(*[
        And(*[
            ExactlyN(*[X[r,c,v] for v in range(1, 10)], k=1, auxvarname='AUX_'+str(next(gen)))
                     for c in range(1, 10)]
        )
        for r in range(1, 10)]
    )

R = And(*[
        And(*[
            ExactlyN(*[X[r,c,v] for c in range(1, 10)], k=1, auxvarname='AUX_'+str(next(gen)))
                     for v in range(1, 10)]
        )
        for r in range(1, 10)]
    )

C = And(*[
        And(*[
            ExactlyN(*[X[r,c,v] for r in range(1, 10)], k=1, auxvarname='AUX_'+str(next(gen)))
                     for v in range(1, 10)]
        )
        for c in range(1, 10)]
    )

B = And(*[
        And(*[
            ExactlyN(*[
                X[3*br+r,3*bc+c,v]
                    for r in range(1, 4)
                    for c in range(1, 4)], k=1, auxvarname='AUX_'+str(next(gen))
            )
            for v in range(1, 10)]
        )
        for br in range(3) for bc in range(3)]
    )

S = And(V, R, C, B)

# This step does absorption
S = S.to_cnf()

def parse_grid(grid):
    chars = [c for c in grid if c in DIGITS or c in "0."]
    assert len(chars) == 9 ** 2
    return And(*[ X[i // 9 + 1, i % 9 + 1, int(c)]
                  for i, c in enumerate(chars) if c in DIGITS ])

grid = ( ".73|...|8.."
         "..4|13.|.5."
         ".85|..6|31."
         "---+---+---"
         "5..|.9.|.3."
         "..8|.1.|5.."
         ".1.|.6.|..7"
         "---+---+---"
         ".51|6..|28."
         ".4.|.52|9.."
         "..2|...|64." )

def get_val(point, r, c):
    for v in range(1, 10):
        if point[X[r,c,v]]:
            return DIGITS[v-1]
    return "X"

def display(point):
    chars = list()
    for r in range(1, 10):
        for c in range(1, 10):
            if c in (4, 7):
                chars.append("|")
            chars.append(get_val(point, r, c))
        if r != 9:
            chars.append("\n")
            if r in (3, 6):
                chars.append("---+---+---\n")
    print("".join(chars))

def solve(grid):
    with parse_grid(grid):
        return S.satisfy_one()

bla = solve(grid)
display(bla)
cjdrake commented 9 years ago

A couple things.

Glad you found the problem. Yes, I designed the variable constructors to always return the same instance given the same name and index. I just thought it would end up being confusing otherwise, and didn't want to try and overload the == operator b/c that would cause all kinds of chaos.

That said, you might not actually need to use the exprvars function. The exprvar function can create variables with multiple levels of names, and multiple indices.

For example:

>>> foo = exprvar(('a', 'b', 'c'), (2, 3, 4))
>>> foo
c.b.a[2,3,4]
>>> bar = exprvar(('a', 'b', 'c'), (2, 3, 4))
>>> foo is bar
True

Also, I think the current function naming convention is confusing. Instead of AtMostN, perhaps we should call it AtMostK instead? Given the math formulas you cited, I think it make a lot of sense to stick with the convention that N represents the number of input arguments, and k is some number between 1 and N.

cjdrake commented 9 years ago

I have to admit, functions that produce new Literal objects has always been a bit of a challenge.

The way the Tseitin transformation works is to just start with index 0 with an empty constraint list. Then just keep incrementing the index as you continue to append new constraints. Fortunately, it's not a nestable operation, so you don't have to worry about the problem you described.

Rather than using a global variable, I think you might want to try the approach of adding new name prefixes. Like at the top level start with auxvarname = 'a', then at nesting level 1 using b.a, then c.b.a, and so forth. When you actually look at the expression, it will be nice to tell the variables apart at different nesting levels.