
Implement faster combinatorial enumeration algorithms #6662

Open pkrathmann2 opened 11 years ago

pkrathmann2 commented 11 years ago

Several of the functions in utilities/iterables.py use algorithms which generate objects (permutations, etc.) from a large class, and then filter to return just those that satisfy some condition. In many cases, there are much faster algorithms available.

The ones I am aware of are multiset_partitions, derangements, and involutions, but I suspect there are more.

I have coded the core of a faster algorithm for multiset_partitions, but am filing this issue first to verify that this is a worthwhile change, and also to check what plans there are/should be for the other ones.

(If this discussion would be better on the group/mailing list, let me know and I will move it over there.)

The derangements code has a TODO comment to implement an ECO version. This is probably the one case where a generate-and-filter approach is pretty good. In the limit, about 37% (1/e) of permutations are derangements. (At least for the set case -- I don't know how the ratio works for multisets.) At any rate, if the generation is fast, the filter doesn't multiply the cost too badly.

Involutions are far more rare -- for n=12, only about 1 in 3400 permutations is an involution. Hence, directly generating involutions could make a huge difference.
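For concreteness, the generate-and-filter approach being discussed amounts to something like the following (illustrative only, for the set case; these are not SymPy's actual routines):

from itertools import permutations

def derangements_by_filter(seq):
    # keep only the permutations with no fixed point (roughly 1/e of them survive)
    return (p for p in permutations(seq)
            if all(a != b for a, b in zip(p, seq)))

def involutions_by_filter(n):
    # keep only permutations p with p[p[i]] == i for all i (very few survive)
    return (p for p in permutations(range(n))
            if all(p[p[i]] == i for i in range(n)))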

I am attaching a first pass at an implementation of an enumerator for multiset partitions. I am pretty confident on the correctness and performance of the base algorithm, but would appreciate input on how best to integrate it into sympy.

In particular -

Original issue for #6662: http://code.google.com/p/sympy/issues/detail?id=3563 Original author: https://code.google.com/u/107622745457661818556/



______
see also [this](http://v.vincent.u-bourgogne.fr/0ABS/0ART/der_end_end.pdf) for derangements
asmeurer commented 11 years ago

This is absolutely useful. Not only are these algorithms used by users, but many are used internally in SymPy itself, so having faster versions is always better.

Regarding your points:

pkrathmann2 commented 11 years ago

A few things I have thought of since my first post-

I was thinking I could have multiset_partitions_taocp() just take a list (array) of multiplicities. Then all element handling would be delegated to the visitor/calling routines.

As indicated in the benchmarks that I include in the attached file, the general function is 4-5 times slower than the special-purpose routines on the set or integer partitions case, but it can be a lot faster in the general case. So presumably the thing to do is to wire it into the general routine, alongside the other routines, but replacing the filtering code.

For derangements, etc., I was thinking that it might be fun to implement a class which does a stack-based implementation of the most basic possible permutation iterator -- place the first element in one of the n available positions, place the second in one of the n-1 remaining, and so on. (And iterate at each level through all the available choices.)

The base class wouldn't be useful, since we already have a very fast permutation iterator in the Python library, but it would be easy to subclass to add in restrictions. For example, derangements would skip placing an element on its original position, while involutions would assign the target element when the base element is assigned.

I figure that this approach would never be as fast as the more optimized algorithms, but it should be a big step up from generate-and-filter, and provides an easy path to implementing some of the more unusual permutation subclasses that Sympy provides.

(If the above is unclear, let me know.)
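A minimal sketch of the placement idea (written as a generator taking a constraint predicate rather than as a subclassable class; purely illustrative, and set case only):

def constrained_placements(seq, allowed):
    """Place seq[0], seq[1], ... into the free positions one at a time,
    yielding each complete placement.  allowed(i, pos) says whether
    element i may occupy position pos; restricting it gives
    derangements, involutions, and so on."""
    n = len(seq)
    out = [None] * n

    def place(i, free):
        if i == n:
            yield list(out)
            return
        for pos in sorted(free):
            if allowed(i, pos):
                out[pos] = seq[i]
                free.remove(pos)
                yield from place(i + 1, free)
                free.add(pos)
                out[pos] = None

    yield from place(0, set(range(n)))

# derangements of a set: element i may never stay in position i
# list(constrained_placements('abc', lambda i, pos: i != pos))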

The function names are just placeholders.

smichr commented 11 years ago

> Naming - should I just put underscores in front of multiset_partitions_taocp and its visitor functions, and then wire it in to the multiset_partitions function? Or, are there other ways to manage the namespace? Any input on function names, etc., would be welcome.

I would underscore and wire it in, calling it from the end of multiset_partitions where it is known that you are dealing with a multiset.

> Should this go in iterables? That file is already more than 2000 lines, and this would expand it noticeably.

I think all the combinatorial-type functions should be moved from iterables to, perhaps, combinatorics/enumeration.py (a new file). I added some functions (as part of a PR that I have open) to functions/combinatorial/numbers.py; those functions could probably live there, too.

> Interface? I represent a multiset as a list of (element, multiplicity) pairs. In other places in sympy, multisets are represented as maps or as lists with duplicates. Should we standardize on one representation? One advantage for a list of pairs, especially for a low-level routine like this, is that the position in the list gives a well defined ordering, without having to look at the elements themselves.

I think the group(foo, False) form is good for the more general routines. I'm not sure I would change other low-level routines (like partitions), however. partitions, for example, needs to be able to quickly change the values and having items in a tuple would make that more cumbersome.

> Level of code. Knuth's pseudocode translates into rather low-level Python. Don't know if there is much to be done about it, short of a full reimplementation.

I wouldn't rewrite it -- I like being able to look at a reference and an implementation and see the correspondence, especially when it's low level.

> Documentation - I put in a rather long introductory comment to introduce the algorithm and data structures. I don't want to just point the reader to Knuth, since his description is rather terse. But, is this level of commenting going overboard?

I don't think so. I liked it.

BTW, the PR that I did now implements enumeration counting that is significantly faster than actually enumerating the combinations, permutations and partitions. E.g. counting all the k-sized partitions of 'mississippi' with it takes about 0.8 seconds, whereas generating the partitions with the current implementation takes over 20 seconds. Perhaps this routine of yours will make it obsolete, since generating all 6033 partitions takes less than a tenth of a second with the new routine.
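In code, the two approaches are roughly the following (a sketch; nT is the counting routine from that PR, which lives in functions/combinatorial/numbers.py, and the timings will depend on which multiset_partitions implementation is installed):

from time import perf_counter
from sympy.functions.combinatorial.numbers import nT
from sympy.utilities.iterables import multiset_partitions

word = 'mississippi'
t0 = perf_counter()
counted = sum(nT(word, k) for k in range(1, len(word) + 1))   # count without generating
t1 = perf_counter()
generated = sum(1 for _ in multiset_partitions(list(word)))   # count by generating
t2 = perf_counter()
print(counted, generated)   # the two totals should agree
print('counting: %.3fs  generating: %.3fs' % (t1 - t0, t2 - t1))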

Original comment: http://code.google.com/p/sympy/issues/detail?id=3563#c3
Original author: https://code.google.com/u/117933771799683895267/
rathmann commented 11 years ago

A start on some of this is at pull request https://github.com/sympy/sympy/pull/1848. I am attaching an updated multiset_taocp.py, which includes faster implementations of range enumeration and counting.

rathmann commented 10 years ago

PR 1848 is now in (thanks smichr!) so a chunk of this issue is now addressed.
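The merged code is in sympy.utilities.enumerative; the low-level routine is used roughly like this (a sketch):

from sympy.utilities.enumerative import multiset_partitions_taocp, list_visitor

# the multiset {a, b, b} expressed as distinct components plus multiplicities
components = 'ab'
multiplicities = [1, 2]
for state in multiset_partitions_taocp(multiplicities):
    # the visitor maps the internal traversal state back onto the elements
    print(list_visitor(state, components))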

The original post mentioned derangements and involutions. The existing code still uses generate-and-filter for those, so that part is still open.

smichr commented 4 years ago

How does the Takaoka routine here compare with the multiset_permutations in iterables?

rathmann commented 4 years ago

Hi Chris,

Are you the original author of multiset_permutations? That is what GitHub blame seems to be saying, but I don't know how reliable that is. If so, do you remember if you based the implementation off of a published algorithm, or is it something you invented?

As (another?) candidate, algorithm L in Knuth (vol 4 p319, 7.2.1.2) also generates the permutations of a multiset, and I would assume it is reasonably efficient.
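For reference, Algorithm L is short enough to transcribe directly from the book's description (a sketch, untested against SymPy's routine):

def multiset_permutations_algL(seq):
    """Knuth's Algorithm L (TAOCP 7.2.1.2): visit the permutations of a
    multiset in lexicographic order."""
    a = sorted(seq)
    n = len(a)
    while True:
        yield list(a)
        # L2: find the rightmost j with a[j] < a[j+1]; if none, we are done
        j = n - 2
        while j >= 0 and a[j] >= a[j + 1]:
            j -= 1
        if j < 0:
            return
        # L3: find the rightmost l with a[j] < a[l], then exchange them
        l = n - 1
        while a[j] >= a[l]:
            l -= 1
        a[j], a[l] = a[l], a[j]
        # L4: reverse the tail a[j+1:]
        a[j + 1:] = a[j + 1:][::-1]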

(A little too late here to look at these in detail right now.)

rathmann commented 4 years ago

I ran a quick benchmark on the existing multiset_permutations code in sympy, and spent a little while looking at the code.

For strings of the form 'a'*n + 'b'*n, the trend seems slightly worse than linear in the number of permutations enumerated, with a rate of about 130,000 permutations per second in interpreted Python 2.7 at n=11.
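(The measurement was essentially of this shape -- a sketch, not the exact script:)

from time import perf_counter
from sympy.utilities.iterables import multiset_permutations

for n in range(8, 12):
    s = 'a' * n + 'b' * n
    start = perf_counter()
    count = sum(1 for _ in multiset_permutations(s))
    elapsed = perf_counter() - start
    print(n, count, round(count / elapsed), 'permutations/second')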

The version smichr posted to Stack Overflow seems to be 2-3 times faster, although it is returning lists of ints rather than lists of the original objects.

smichr commented 4 years ago

> Are you the original author of multiset_permutations?

I generally attribute sources for code. It has been a while since I worked on this and the origin of it has been lost to me. Looking at it now, I see that it basically handles some special cases (base cases) and then uses recursion.

As for the one on Stack Overflow... if it is better, recasting those ints to elements shouldn't be too hard.

smichr commented 4 years ago

I was also thinking that we could add counting of multiset derangements, which is given in terms of an integral of a product of Laguerre polynomials.
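The count in question (writing the multiplicities as n_1, ..., n_k) is

    D(n_1, ..., n_k) = | \int_0^\infty e^{-x} L_{n_1}(x) L_{n_2}(x) ... L_{n_k}(x) dx |

where L_m is the ordinary Laguerre polynomial; the absolute value absorbs the usual (-1)^(n_1+...+n_k) sign. This is exactly what the D() function below evaluates.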

A fast method to get a random derangement would be nice too...the next(generate_derangements(long_text)) takes a longgg time.

>>> from time import time
>>> from sympy.utilities.iterables import generate_derangements
>>> t = 'SymPy: a CAS in pure Python'
>>> for i in range(2,len(t)):
...  a=time();i,''.join(next(generate_derangements(t[:i]))),time()-a
...
(2, 'yS', 0.0)
(3, 'myS', 0.0)
(4, 'SPym', 0.0)
(5, 'SyyPm', 0.0)
(6, 'P:yySm', 0.0)
(7, ': SyyPm', 0.015599966049194336)
(8, ': SPyyam', 0.031199932098388672)
(9, ':P  ayySm', 0.14039921760559082)
(10, ':C  SPyyam', 1.2479918003082275)
(11, ':A  PCayySm', 12.27712106704712)

Here is a start at this:

def random_derangement(t, choice=None):
    """return a list of elements in which as few as possible are in the
    same position. If the item appearing most often appears less than
    half of the time then all elements will have new positions. To
    produce a pseudorandom derangement, pass a pseudorandom selector
    like `Random(seed).choice`

    Examples
    ========

    >>> t = 'SymPy: a CAS in pure Python'
    >>> ''.join(random_derangement(t))
    't hSaAyoy uepPPry SCn mn: i'
    >>> ''.join(random_derangement(t))
    'mup  SSyyP h:ContneAa  iPyr'

    A predictable result can be obtained by using a pseudorandom
    generator for the choice:

    >>> from random import Random
    >>> c = Random(1).choice
    >>> a, b = [''.join(random_derangement(t, c)) for do in range(2)]
    >>> assert a != b
    >>> c = Random(1).choice
    >>> A, B = [''.join(random_derangement(t, c)) for do in range(2)]
    >>> assert a == A and b == B

    """
    from sympy.utilities.iterables import multiset
    if choice is None:
        import secrets
        choice = secrets.choice
    def shuffle(rv):
        '''
        -- To shuffle an array a of n elements (indices 0..n-1):
        for i from n−1 downto 1 do
            j ← random integer such that 0 ≤ j ≤ i
            exchange a[j] and a[i]
        '''
        for i in range(len(rv)-1,0,-1):
            x = choice(rv[:i + 1])
            j = rv.index(x)
            rv[i], rv[j] = rv[j], rv[i]
    def pick(rv, n):
        shuffle(rv)
        return rv[:n]
    ms = multiset(t)
    tot = len(t)
    ms = sorted(ms.items(), key=lambda x: x[1])
  # if there are not enough spaces for the most
  # plentiful element to move to, then some of them
  # will have to stay in place
    xs = 2 * ms[-1][1] - tot
    if xs > 0:
        opts = [i for (i, c) in enumerate(t) if c == ms[-1][0]]
        pick(opts, xs)
        stay = sorted(opts[:xs])
        rv = list(t)
        for i in reversed(stay):
            rv.pop(i)
        rv = random_derangement(rv, choice)
        for i in stay:
            rv.insert(i, ms[-1][0])
        return ''.join(rv) if type(t) is str else rv
  # the normal derangement calculated from here
    if len(t) == len(ms):
        rv = list(t)
        shuffle(rv)
        return ''.join(rv) if type(t) is str else rv
    while 1:
        rv = [None] * len(t)
        j = -1  # start with the most numerous, first
        while 1:
            opts = [i for i in range(len(t)) if t[i] != ms[j][0]
                    and rv[i] is None]
            if len(opts) < ms[j][1]:
                break
            pick(opts, ms[j][1])
            for i in opts[:ms[j][1]]:
                rv[i] = ms[j][0]
            j -= 1
            if -j > len(ms):
                return ''.join(rv) if type(t) is str else rv

def D(s):
    """return the number of derangements for the elements in set `s`

    Examples
    ========

    >>> from sympy.utilities.iterables import generate_derangements
    >>> list(''.join(i) for i in generate_derangements('sympy'))
    ['pyyms', 'pyysm', 'syymp', 'syypm', 'ymyps', 'ymysp',
    'ysymp', 'ysypm', 'yymps', 'yymsp', 'yypms', 'yypsm']
    >>> assert D('sympy') == len(_)
    >>> D('pure Python CAS')
    207316760352
    >>> D('aaabbb')
    1
    >>> D('abcddd')
    6
    >>> D('abbccc')
    3
    """
    from sympy import Mul, exp, factorial, integrate, laguerre, oo, subfactorial
    from sympy.abc import x
    from sympy.utilities.iterables import multiset
    ms = multiset(s)
    if len(ms) == len(s):
        return subfactorial(len(s))
    big = max(ms.values())
    if big*2 > len(s):
        return 0
    if big*2 == len(s):
        if len(ms) == 2 and len(set(ms.values())) == 1:
            return 1  # aaabbb
        for k in ms:
            if ms[k] == big:
                break
        t = [i for i in s if i != k]
        if len(t) == len(set(t)):
            return factorial(len(t))  # abc part of abcddd
    return int(abs(integrate(exp(-x)*Mul(*[
        laguerre(i, x) for i in ms.values()]), (x, 0, oo))))

P.S. those Laguerre polynomials are pretty cool. Oh, the things I do not know!

smichr commented 4 years ago

D could be named nD and be added to the sympy/functions/combinatorial/numbers.py where nC, nP, and nT already reside.

rathmann commented 4 years ago

Hi Chris,

Lots of great stuff here; let me make some comments.

General comments on algorithm and existing code.

Comments on @smichr's recent code snippets.

smichr commented 4 years ago
  • but should it then be called a derangement?

Maybe there could be a "strict" keyword which, when False, would permit the best possible derangement.

smichr commented 4 years ago
  • Bug (I think) in the special cases code for D

Thanks. Now correct (I think) :-)

The secrets.choice will truly return a random element; it is reported to be cryptographically robust. The shuffle routine is the Knuth/Fisher-Yates shuffle, which mimics random.shuffle but uses secrets.choice (or whichever choice function was passed in).

smichr commented 4 years ago
  • derangements are OK with a generate-and-test approach, since about 1/e of all permutations are derangements. While this may be technically correct for the set case,

This is why random_derangement can work so well. But even for the set case the performance can be very bad. It takes a looong time to generate the first derangement of string.ascii_letters. Here is a demo with a subset of the same:

>>> a=time();next(generate_derangements(string.ascii_lowercase[:10])),time()-a
(['b', 'a', 'd', 'c', 'f', 'e', 'h', 'g', 'j', 'i'], 0.9399998188018799)
>>> a=time();next(generate_derangements(string.ascii_lowercase[:11])),time()-a
(['b', 'a', 'd', 'c', 'f', 'e', 'h', 'g', 'j', 'k', 'i'], 9.219599962234497)

The time for next(derangements(string.ascii_letters)) -- using all 52 characters -- is only 0.06 seconds for the routine below which (I believe) could be a general replacement for the generate_derangements routine:

def derangements(S):
    """Yield unique derangements of S.

    Examples
    ========

    >>> [''.join(i) for i in derangements('abbcc')]
    ['bccba', 'bccab', 'cacbb', 'ccabb']

    The return value is a list of elements of S which is modified
    internally in place, so a copy of the return value should be
    made if collecting the results in a list (or strings should be
    joined as shown above):

    >>> [i for i in derangements([1,2,3,3])]
    [[3, 3, 2, 1], [3, 3, 2, 1]]
    >>> [i.copy() for i in derangements([1,2,3,3])]
    [[3, 3, 1, 2], [3, 3, 2, 1]]

    Non-hashable items may be passed (and are remapped to integers
    internally):

    >>> [i.copy() for i in derangements([[],(),(),{}])]
    [[(), (), {}, []], [(), (), [], {}]]

    """
    from sympy import ordered
    from sympy.utilities.iterables import multiset, subsets, uniq
  # recast non-int elements to ints
    if type(S) is not str and not all(type(i) is int for i in S):
        s = list(ordered(uniq(S)))
        d = dict(enumerate(s))
        remap = []
        for i, si in enumerate(s):
            remap.extend([i]*S.count(si))
        for di in derangements(remap):
            yield [d[i] for i in di]
        return
  # deal with str or int values for S
    s = set(S)
  # at each position, these are what may be used
    P = [sorted(s - set([k])) for k in S]
  # these are the counts of each element
    C = multiset(S)
  # the index to what we are using at each position
    I = [0]*len(P)
  # the list of return values that will be modified in place
    rv = [None]*len(P)
  # we know the value that occurs most will be located in subsets
  # of the other positions so find those positions...
    mx = max(C.values())
    for M in sorted(C):
        if C[M] == mx:
            break
    ix = [i for i,c in enumerate(S) if c != M]
  # remove M from its current locations...
    for i in ix:
        P[i].remove(M)
  # and make them fixed points each time.
    for fix in subsets(ix, mx):
        p = P.copy()
        for i in fix:
            p[i] = [M]
        while 1:
            Ci = C.copy()
            for k, pk in enumerate(p):
                c = pk[I[k]]
                if Ci[c] == 0:
                  # can't select this at position k
                    break
                else:
                    Ci[c] -= 1
                    rv[k] = c
            else:
                yield(rv)
          # increment last valid index
            I[k] = (I[k] + 1)%len(p[k])
            if I[k] == 0:
              # set all to the right back to 0
                I[k + 1:] = [0]*(len(I) - k - 1)
              # carry to the left
                while k and I[k] == 0:
                    k -= 1
                    I[k] = (I[k] + 1)%len(p[k])
            if k == 0 and I[k] == 0:
                break

This is also referenced here.

asmeurer commented 4 years ago

Also, in what sense is the result of random_derangement "random"? One definition would be that if I were to list out all the derangements from generate_derangements(perm), random_derangement would be equally likely to return any of them. Don't know if this is the case, or how to prove it.

Even for random permutations you have to be careful about this. For instance, if the RNG has a period, then n! will quickly grow to be larger than its period, meaning some permutations are impossible to produce (see e.g., https://docs.python.org/3/library/random.html#random.shuffle). If derangements scale like n!/e they will have the same problem. It probably isn't of practical relevance, but it's worth mentioning in the docstring. For secrets randomness I think you avoid this problem, though you then lose the ability to do things like seed the generator.
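To put a number on it (a quick sketch; 2**19937 - 1 is the period of the Mersenne Twister that CPython's random module uses):

from math import factorial

period = 2**19937 - 1          # period of CPython's Mersenne Twister
n = 1
while factorial(n) <= period:
    n += 1
# n is now the smallest length whose permutation count exceeds the period;
# the Python docs quote 2080 as the largest length that still fits.
print(n)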

smichr commented 4 years ago

> lose the ability to do things like seed the generator.

In that case you send in random.choice as shown in the docstring.

rathmann commented 4 years ago

In my

> in what sense is the result of random_derangement "random"?

comment/question above, I meant that it is quite possible for a procedure to take random bits and still create a biased sample. For example, a procedure might choose between two subtrees with equal probability, even if one of the subtrees has more results than the other. This would over-weight the results in the smaller subtree. I don't understand random_derangement() well enough to know if it does this; I was just mentioning it as something to check. Also, even if the routine doesn't take a perfectly flat sample, that doesn't mean it isn't useful; it is just something to be aware of and document.

Looking forward to trying out the new derangements() function.

Also, one thing I noticed about the derangements code in general -- a call to generate_derangements('aabb') will yield the same (single) derangement as generate_derangements('abab'). The input list is sorted first, and the derangement is then relative to this canonical order. I think this is the right thing to do, but some people might stumble on it.

It is more of an issue for derangements than permutations in general, because derangements don't form a subgroup -- where you start really matters. Might be worth including some discussion of this in the documentation.

smichr commented 4 years ago

> The input list is sorted first, and the derangement is then relative to this canonical order.

I'm not sure this is a good idea for generate_derangements since a derangement implies a reference point. Both random_derangement and derangements compute results based on the given order:

from collections import defaultdict
d=defaultdict(int)
for i in range(100):
 d[''.join(random_derangement('baby'))]+=1
print(sorted(d.items()))
[('abyb', 44), ('ybab', 56)]

d=defaultdict(int)
for i in range(100):
 d[''.join(random_derangement('abby'))]+=1
print(sorted(d.items()))
[('bayb', 50), ('byab', 50)]

print([''.join(i) for i in derangements('abby')])
['bayb', 'byab']
print([''.join(i) for i in derangements('baby')])
['abyb', 'ybab']

To recapture the derangements relative to the input, generate_derangements would have to keep track of the initial permutation of elements. Perhaps like this:

def permof(s):
    from collections import defaultdict
    from sympy.combinatorics import Permutation
    d = defaultdict(list)
    for i,c in enumerate(s):
        d[c].append(i)
    return Permutation([j for i,j in sorted(zip(
        (d[c].pop(0) for c in sorted(s)),
        range(len(s))))])

>>> from sympy.utilities.iterables import generate_derangements
>>> def d(s):
...    p = permof(s)
...    for i in generate_derangements(s): yield p(i)

>>> [''.join(i) for i in d('baby')]
['abyb', 'ybab']
>>> [''.join(i) for i in d('abby')]
['bayb', 'byab']
smichr commented 4 years ago

Looks like something is wrong with random_derangement.

d=defaultdict(int)
t=[1,2,2,3,3]
for i in range(10000):
    d[tuple(random_derangement(t))]+=1
for k in sorted(d):
    print(d[k],k)

3381 (2, 3, 3, 1, 2)
6619 (3, 1, 3, 2, 2)

>>>  [''.join(i) for i in derangements('12233')]
['23321', '23312', '31322', '33122']

Ah. I had made a change locally that I didn't update here. pick and shuffle have been updated:

1231 (2, 3, 3, 1, 2)
1273 (2, 3, 3, 2, 1)
3706 (3, 1, 3, 2, 2)
3790 (3, 3, 1, 2, 2)

and repeat run

1252 (2, 3, 3, 1, 2)
1231 (2, 3, 3, 2, 1)
3686 (3, 1, 3, 2, 2)
3831 (3, 3, 1, 2, 2)

another

1217 (2, 3, 3, 1, 2)
1235 (2, 3, 3, 2, 1)
3817 (3, 1, 3, 2, 2)
3731 (3, 3, 1, 2, 2)
smichr commented 4 years ago

From the reference in the OP, here are some involution-related functions:

def check(n):
    # brute-force count of involutions among the permutations of n + 1 elements
    from sympy.combinatorics import Permutation
    p = Permutation(n)
    i = 0
    while p:
        if all(len(c) < 3 for c in p.cyclic_form):
            i += 1
        p = p.next_lex()
    return i

def involutions(n):
    """generate involutions for set of size n"""
    # from https://hal.inria.fr/hal-00824068/file/704-4975-1-PB.pdf
    from array import array
    if n < 1:
        return
    p = list(range(n + 1))
    F = array('B', [0])*(n + 1)
    F[1] = 1
    def gen(t):
        if t == n:
            yield p
        else:
            for i, inset in enumerate(F):  # i must be in sorted order
                if not inset:
                    continue
                p[i], p[t + 1] = p[t + 1], p[i]
                F[i] = 0
                yield from gen(t + 1)
                F[i] = 1
                p[i], p[t + 1] = p[t + 1], p[i]
            F[t + 1] = 1
            yield from gen(t + 1)
            F[t + 1] = 0
    for y in gen(1):
        yield tuple(y[i] - 1 for i in range(1, len(y)))

# the invo_count and random_involutions are from
# https://stackoverflow.com/questions/38995387/how-to-generate-a-pseudo-random-involution

_invo_cnts = [1, 1, 2, 4, 10, 26, 76, 232, 764, 2620, 9496, 35696, 140152]
def invo_count(n):
    """Return the number of involutions of size n and cache the result."""
    for i in range(len(_invo_cnts), n+1):
        _invo_cnts.append(_invo_cnts[i-1] + (i-1) * _invo_cnts[i-2])
    return _invo_cnts[n]

def random_involution(n, randrange=None):
    """Return a random (uniform) involution of size n.
    An involution is a permutation which, when applied
    to itself, returns the identity permutation (all elements
    back in place); it is a permutation in which no cycle
    has length greater than 2.

    Examples
    ========

    >>> from sympy.combinatorics import Permutation
    >>> from random import Random
    >>> randrange = Random(0).randrange
    >>> random_involution(4, randrange)
    [0, 3, 2, 1]
    >>> p = Permutation(_)
    >>> p
    Permutation(1, 3)
    >>> p*p
    Permutation(3)
    >>> _.size
    4

    """
  # make randrange truly random by default
    if randrange is None:
        from secrets import randbelow
        randrange = lambda n: randbelow(n)
  # Set up main variables:
  # -- the result so far as a list
    v = list(range(n))
  # -- the list of indices of unseen (not yet decided) elements.
  #    unseen[0:do] are unseen/undecided elements, in any order.
    unseen = v[:]
    do = n
  # Make an involution, progressing one or two elements at a time
    while do > 1:  # if only one element remains, it must be fixed
      # decide whether current element (index do-1) is fixed
        if randrange(invo_count(do)) < invo_count(do - 1):
          # Leave the current element as fixed and mark it as seen
            do -= 1
        else:
          # In involution, swap current element with another not yet seen
            i = randrange(do - 1)
            other = unseen[i]
            current = unseen[do - 1]
            v[current], v[other] = v[other], v[current]
          # Mark both elements as seen by removing from start of unseen[]
            unseen[i] = unseen[do - 2]
            do -= 2
    return v

>>> check(6)
232

>>> invo_count(7)  # Permutation(6) has 7 elements, 0..6
232

>>> for i,j in enumerate(involutions(7)): pass
...
>>> i+1
232
rathmann commented 4 years ago

Good idea to add involutions.

I just looked and saw that my original comment called out involutions. The released code does include involutions, but I see that it still does it by generate and test. Since involutions are a small proportion of permutations (the number of involutions grows only something like sqrt(n!), I think), this could be a huge speedup.
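(A quick look at the proportions, reusing the invo_count recurrence from the earlier comment:)

from math import factorial

def invo_count(n, _cache=[1, 1]):
    # I(n) = I(n-1) + (n-1)*I(n-2); the mutable default is used as a memo
    for i in range(len(_cache), n + 1):
        _cache.append(_cache[i - 1] + (i - 1) * _cache[i - 2])
    return _cache[n]

for n in (8, 12, 16):
    print(n, invo_count(n), factorial(n) // invo_count(n))
# the last column is roughly how many permutations a generate-and-test
# approach must produce for each involution it keeps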

There is the (quite old) comment in the code for generate_derangements:

TODO: This will be rewritten to use the ECO operator approach once the permutations branch is in master.

Do we think anyone still has an ambition of doing this? If not should we delete the comment? Or at least move it so that it doesn't show up in the documentation?

I am busy this weekend, so won't be able to look at the new routines for a few days.

smichr commented 4 years ago

For anyone looking at the reference for the ECO approach, working translations are located here. There are some subtleties/errors in translation that have been addressed.

The 3-cycle sigma permutation that is applied to pi can be translated like this, too, using SymPy Permutation:

                    j = p[i]
                    s = Permutation(n)(
                        i, t + 1)(
                        j, t + 2)(
                        t + 1, t + 2)
                    p[:] = s(p)
                    for y in gen(t+2):
                        yield y
                    p[:]=(s**-1)(p)
rathmann commented 4 years ago

Hello Chris,

Thanks so much for posting the ECO implementations. They are very interesting, and inspired me to take another crack at the Vajnovszki paper. Since you must have read and understood it to create these implementations, let me start with a few questions/comments on ECO in general, and the paper in particular.

An ECO operator is similar to a generative grammar or production system. In particular, I don't think it would be too hard to implement a generic driver function which would take a to-be-defined operator object, a root node, and a depth target n, and run the productions forward to exhaustively enumerate all the objects in the class of size n. The driver would be basically a tree traversal. The implementation could either keep a stack of the state at each level of the traversal, or (and this seems closer to the Vajnovszki style) keep just one combinatorial object, require the operator to define forward and backward transitions, and push the single CO through lots of transitions as the driver traverses the space. Then we could implement a few example operators (derangements, involutions, ...), and offer the generic facility to our users to specialize with new operators and new classes of combinatorial objects.

However, defining and verifying an ECO operator for a class of objects seems (to me at least) as hard or harder than designing a generation algorithm from scratch. For example, we presumably want multiset derangements and involutions, but I don't know how these would map to an ECO system. So I don't know how much use we would see for a generic ECO facility. Thoughts?

Second, I wasn't sure how tightly Vajnovszki meant to define CAT (constant amortized time). In particular, his implementation of gen_Perm includes a Print(pi) call, which takes time proportional to the length of the permutation, i.e., NOT constant. He also multiplies permutations together, which again is not constant (except for some special cases). For our purposes, I would be fine with "linear in length of output", but it would be nice to get a clearer understanding.

Third, (and this is a more technical issue) I am wondering about the many transformations of the form pi = pi . <i, t+1>. It would seem that it would make more sense to use pi = <i, t+1> . pi. In the second, the transposition operates on permutation pi and swaps two values. (And I notice that the implementations on repl.it use swaps.) Am I misunderstanding? Is this one of the cases where the permutations commute, so that it doesn't matter? Is this a mistake in the paper?

smichr commented 4 years ago

Please note that although I have worked with the combinatorics module and have translated the ECO to the best of my ability, I have very little deep understanding about such matters. One of the difficulties in translation was not being fluent in the notation that was being used and not finding (as I recall) an explanation in the paper. So the only interpretation that made sense to me was the swap for the pi.<i, t+1>. The other challenge was the 3-cycle operation -- knowing what order the cycles were to be applied in.

Note, too, that algorithm 3 requires the elements of the set F to be sorted. The doctest will fail if that is not the case as a permutation can be otherwise repeated. And "1 to t" actually means "1 through t", I believe. (At least that is how I have interpreted it.)

Regarding CAT/print - I translate "Print" loosely to mean "yield". And in terms of permutation multiplication, perhaps (as the translation reflects) since the multiplication can be reduced to a constant number of swaps (1 or 3) the claim still holds. This would be analogous to multiplying by the number 10...010...01 where the ... are some number of zeros. This would only actually require 3 judicious multiplications, not n for an n digit number.

Finally -- and take this with a grain of salt considering my limited background -- I did feel a bit of a disconnect between what seemed to be a general approach (ECO) and the few applications of it. I would have just written the paper as "CAT algorithms for the generation of 6 types of permutations." But perhaps it can be extended to give permutations with other constraints as is mentioned here.

rathmann commented 4 years ago

I have been having fun fooling around with the ECO implementations that you posted. I appreciate that these are faithful to the pseudocode in the paper -- that makes it much easier to follow and understand what Vajnovszki intended.

But, I also wanted to try variations that seemed more like common regular Python code or might be faster.

I am starting with alg1 (permutations) since it is the simplest. Here are some of the things I tried:

This resulted in:

def gen_perm2(t):
    """Starting to hack perm impl

    Required environment (nonlocal variables)
    n2 = length of perms
    p2 == list(range(n + 1))
    """
    for i in range(1, t + 2):
        p2[i], p2[t + 1] = p2[t+1], p2[i]
        if ((t+1) == n2):
            yield p2
        else:
            yield from gen_perm2(t + 1)
        p2[i], p2[t + 1] = p2[t + 1], p2[i]

On a much more drastic level, I wanted to see if I could get rid of the yield from statements. These mean that a permutation of length 10 is going through something like 10 levels of yield, which seems to violate the desired CAT properties.

A recursive function can usually be mechanically rewritten into an iterative version, by storing state on an explicit stack and replacing the calls and returns with GOTOs. Python lacks a GOTO so I had to think a bit, but here is the result:

def eco_cleaner(plist):
    """The eco routines yield a list which gets modified for every
    permutation.  Also, the numbers are 1-based, and only those
    starting at index 1 are part of the actual permutation.
    This function remedies these issues.

    usage example:  [eco_cleaner(perm) for perm in alg1_iter(4)]
    """
    return tuple(i-1 for i in plist[1:])

def alg1_iter(n):
    """Yield length-n permutations.  Result is a snapshot of internal list
    and needs post processing.  (See eco_cleaner().)
    """

    # This is a regular (leaf) function so can use local variables for
    # state.

    p = list(range(n + 1))
    t = 1

    # Changes needed to accommodate simulation of recursion with GOTOs
    # i gives state of iteration at a level of the tree.  Hence, it
    # needs to be an array (stack), to save values at different levels.
    i = [0]*(n+2)  

    # Boolean flag to work around lack of GOTO.  If returning is true,
    # there is more work to do on this level, and do NOT reinitialize
    # i[t]
    returning = False 

    while True :

        if not returning:
            # Got to here on a call, not return from call
            i[t] = 1

        if (i[t] < t+2):

            p[i[t]], p[t + 1] = p[t+1], p[i[t]]
            if ((t+1) == n):
                yield p
                returning = True
                p[i[t]], p[t + 1] = p[t + 1], p[i[t]]
                i[t] += 1
                continue   # GOTO equivalent
            else:
                t=t+1
                # Former recursion
                returning = False
                continue   # GOTO equivalent
        else:
            # Have finished all nodes at this level,
            # need to go back up.  This is the housekeeping
            # of the returns of recursive version
            t = t-1
            if t < 1:
                return # Was already at top of stack, done overall.
            p[i[t]], p[t + 1] = p[t+1], p[i[t]]
            i[t] += 1
            returning = True
            #Back to top of loop

This is admittedly kind of ugly. Mostly, I think because I was trying to do a translation of what was there, rather than a from-scratch iterative traversal.

It is only a bit faster with regular python3, but much faster (factor of 5) with pypy3.

Here is a simple timing snippet:

    from time import perf_counter

    n = 10
    print("Checking timings.  n=", n)

    c1 = perf_counter()
    count1 = sum(1 for _discard in alg1(n))
    c2 = perf_counter()

    count2 = sum(1 for _discard in (eco_cleaner(pl) for pl in alg1_iter(n)))

    c3 = perf_counter()

    print("Enumerated", count1, count2, "permutations of length", n)
    print("Elapsed recursive:", c2-c1)
    print("Elapsed iterative:", c3-c2)
rathmann commented 4 years ago

The larger question, of course, is what do we want to do with SymPy.

At a minimum, I think the existing routines for involutions and derangements should be replaced with faster versions.

We could adapt the ECO versions for the set case, or pull something from Knuth.

Hacking an iterative version of alg1 was just for my learning -- we already have very good permutations implementations available. I do think I could make iterative versions of the other algorithms, if that turns out to be desirable.

Chris also provides a number of interesting algorithms in other comments on this issue.

The literature on this area is large. I was looking through the "related work" section of a recent paper on permutation languages and was struck by the sentence:

Pattern avoidance in permutations is a central topic in combinatorics, as illustrated by the books [Kit11, Bón12], and by the conference ‘Permutation Patterns’, held annually since 2003.

Given the richness of available results, the question is what tiny fraction should we include in SymPy.

rathmann commented 4 years ago

Edit -- I now have an idea why he used F. See comment of 18 Feb 2020.


Another question -- any idea why Vajnovszki used a supplementary set F in Algorithm 3 (involutions)? An index i is in the fixed point iff p[i]==i, so it is just as easy to check directly.

Here is a modified version, which seems to provide the same output as alg3

def alg3_nof(n):
    """
    generate length-n involutions by fixed-points

    >>> list(alg3(3))
    [(1, 0, 2), (2, 1, 0), (0, 2, 1), (0, 1, 2)]
    >>> [len(list(alg3d(i))) for i in range(1, 7)]
    [1, 2, 4, 10, 26, 76]

    """
    # Hack to get rid of F: index i is a fixed point iff p[i] == i
    if n < 1:
        return
    p = list(range(n + 1))
    def gen(t):
        if t == n:
            yield p
        else:
            for i in range(1, t+1):
                if not p[i] == i:
                    continue
                p[i], p[t + 1] = p[t + 1], p[i]
                yield from gen(t + 1)
                p[i], p[t + 1] = p[t + 1], p[i]
            yield from gen(t + 1)
    for y in gen(1):
        yield tuple(y[i] - 1 for i in range(1, len(y)))

As before, thanks for not making such a modification on your repl.it version. Differences from the paper would just have been confusing.

smichr commented 4 years ago

Interesting notes and modifications that you have made. I assume the comment you had about yield from applies to the (equivalent?) structure

for i in iter:
    yield i  # instead of `yield from iter`

Good catch, I think, about F.

As for SymPy, I think it is good to have the ability to generate a random derangement quickly, and to have "good enough" building blocks for other generators (as given in ECO).

smichr commented 4 years ago

> if the RNG has a period, then n! will quickly grow to be larger than its period, meaning some permutations are impossible to produce

rereading this I see what you are getting at. So it's not only a matter of not seeing a permutation because there isn't time (given the colossal number of possible permutations for a large set); it's a matter of nobody ever seeing certain permutations because the generator is incapable of showing certain sequences.

rathmann commented 4 years ago

I am not an expert, but yes, I was thinking of yield from as a more convenient equivalent to the loop that you need to use in 2.7. All the classes we are enumerating are exponential in the size of the object, so depths much bigger than 10 are perhaps unlikely. But still, if every object has to percolate up through 10 levels of yield from or yield statements, that is going to add some overhead.

rathmann commented 4 years ago

I looked again at the paper, and I have a guess as to why Vajnovszki used the auxiliary set F in algorithm 3 (involutions by fixed points). If the fixpoints have a low density within the permutations, you are potentially wasting a lot of time skipping over the non-fixpoint elements of p -- perhaps enough that a test-for-each-i variation is not CAT.

If F can be implemented so that sorted traversal, insert and remove elements are all constant time per element, then the algorithm is CAT overall. Vajnovszki doesn't give such an implementation of F, although I imagine it could be done with a (sorted) doubly linked list. Knuth's dancing links algorithm uses this trick extensively.

A Python implementation of a doubly linked list would take some effort and add overhead. As a practical matter, even if it is not CAT in theory, it might be that just checking p[i]==i will be fast enough.
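A minimal sketch of what such an F could look like (an array-based doubly linked list over 1..n with dancing-links unlink/relink; illustrative only, not wired into alg3):

class LinkedSet:
    """Keep a subset of 1..n in sorted order, with O(1) removal and
    restoration of a known member (Knuth's dancing-links unlink/relink).
    Restorations must undo removals in reverse order, which is exactly
    the backtracking discipline the ECO involution algorithm follows."""
    def __init__(self, n):
        # node 0 is a circular sentinel; members 1..n are initially all present
        self.next = [i + 1 for i in range(n + 1)]
        self.next[n] = 0
        self.prev = [i - 1 for i in range(n + 1)]
        self.prev[0] = n

    def members(self):
        i = self.next[0]
        while i:
            yield i
            i = self.next[i]

    def unlink(self, i):
        # splice i out; i keeps its own links so it can be relinked later
        self.next[self.prev[i]] = self.next[i]
        self.prev[self.next[i]] = self.prev[i]

    def relink(self, i):
        # undo the matching unlink
        self.next[self.prev[i]] = i
        self.prev[self.next[i]] = i

# F = LinkedSet(5); F.unlink(3); list(F.members()) -> [1, 2, 4, 5]; F.relink(3)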

rathmann commented 4 years ago

I think I now have a more reasonable iteration-based ECO generator.

I implement a given ECO operator as a generator which yields the children of a particular permutation. The function eco_traverse takes such an operator and traverses the tree of permutations to the desired depth. The traverser maintains the traversal state as a stack of (in progress) generators.

So far my traverser works only with the "easy" operators (where I define easy as those ECO rules which only have successors at level t+1). But, I think extending it to the other operators will not be too bad.

Here is the (still rough) implementation:

def perm_operator(p, t):
    """Yields a sequence of permutations which are ECO successors of
    length t permutation p.  The successors are of size t+1.  

    Data structure: p is a list of integers.  p[0] contains 0, and is
    not used.  p[1]..p[t+1] is the permutation of ints 1..t+1.  The
    list may contain elements beyond t+1, but these are not
    conceptually part of the permutation in question.  These should
    always preserve the invariant that p[k]==k for k>t+1.

    The list p has object semantics.  Many operators have access to,
    and may be modifying, p.  Because of the way ECO is set up, by
    the time this operator is returned to, via next(), all of the
    changes will have been reversed, and it will be as if this
    generator had a private copy.
    """
    for i in range(1, t+1):
          p[i], p[t + 1] = p[t+1], p[i]
          yield p
          p[i], p[t + 1] = p[t+1], p[i]

    # The last swap <t+1, t+1>, is a no op, so omit as a minor
    # optimization/special case.
    yield p

def cyclic_p_operator(p, t):
    """Operator to produce permutations with a single cycle. Would be alg1b.

    Derived from regular permutation operator by omitting the child
    which does not swap the (t+1)st element.
    """
    for i in range(1, t+1):
        p[i], p[t + 1] = p[t+1], p[i]
        yield p
        p[i], p[t + 1] = p[t+1], p[i]
    # Omit the last yield here.
    # yield p

def involution_p_operator(p, t):
    """ECO operator which yields only involutions, i.e., p*p == identity.

    Implementation not CAT, since it scans through all t+1 elements of p
    and only expands some of them.
    """
    for i in range(1, t+1):
        if p[i] == i:
            p[i], p[t + 1] = p[t+1], p[i]
            yield p
            p[i], p[t + 1] = p[t+1], p[i]
    yield p

def eco_traverse(e_op, n):
    """Iteratively traverse the tree generated by ECO operator e_op.

    Maintains traversal state in a stack of generators.  
    """
    p = list(range(n + 1))

    if n==1:
        yield p   # argument p should be in the set.
        return
    istack = [0] * (n+2)

    t=1     
    istack[t] = e_op(p, t) # Root from which the traversal expands

    while True:
        try:
            if t==(n-1):
                for p in istack[t]:
                    yield(p)
                raise StopIteration
            else:
                p=next(istack[t])
                t=t+1
                istack[t] = e_op(p, t)
        except StopIteration:
            t = t-1
            if t<1:
                return

Performance is slower than alg1_iter(), but not too bad. For example, on my laptop, with pypy3 it enumerates the 3628800 length-10 permutations in about 0.255 seconds. Adding the 0/1 fixup increases this to about 1.8 seconds (about 7x longer). However, I assume the fixup code is just to make the source look like the paper. It shouldn't be hard to rework the routines to generate in Python conventions, and without the leading 0.

rathmann commented 4 years ago

A few programming notes:

  1. For the less easy ECO operators, I need a way to explicitly store the length of the permutation with the permutation itself. Otherwise eco_traverse can't use the yielded permutation to initialize the next level generator, and doesn't know whether to put that generator in the t+1 or t+2 slot. Some options include:
    • Defining a permutation class with members for the permutation list and for the length
    • Use the unused 0th element of the list as the active length. That would be the C approach, but too ugly for us I think, especially since I hope to get rid of the 0/1 mismatch anyway.
    • Have a 2 element list, pp, with the convention that pp[0]==length and pp[1]==p (old list for perm). This is what I am leaning towards.
  2. What made alg1_iter() more difficult to write and understand is that structured programming discourages (and Python forbids) having a GOTO into the middle of a loop. However, if you are simulating a function call/return using a stack, the call is a GOTO back to the start of the function, and the return is a GOTO back to just after the "call". The GOTO back to the start can be simulated with a break or continue, but the return is more difficult, especially if the operator has multiple recursive calls. The stack-of-generators approach seems to paper over this issue.
  3. The operator/traverser design makes it reasonably easy to experiment with alternate implementations for the operators. For example, I noticed that the permutation operator does two swaps between yields, but that only three memory locations are touched. So, these can be rewritten as a 3-element rotate. This seems to work fine, and might be a bit faster, but the run-to-run variability is big enough it is not definitive.
def perm_operator2(p, t):
    """Same semantics as perm_operator. Testing whether it is worthwhile
    to optimize the swaps. (Not expecting a huge effect.)

    Result: Sometimes faster, sometimes slower.
    Test of eco_traverse only, i.e.:
         sum(1 for _discard in eco_traverse(perm_operator2, n))
    """
    if t < 3:
        for i in range(1, t+1):
              p[i], p[t + 1] = p[t+1], p[i]
              yield p
              p[i], p[t + 1] = p[t+1], p[i]

        # The last swap <t+1, t+1>, is a no op, so omit as a minor
        # optimization/special case.
        yield p
    else:
        p[1], p[t+1] = p[t+1], p[1]
        yield p
        for i in range(2, t+1):
            # Replace a pair of swaps with equivalent three way assignment
            # p[i-1], p[t + 1] = p[t+1], p[i-1]
            # p[i], p[t + 1] = p[t+1], p[i]
            p[i-1], p[i], p[t+1] = p[t+1], p[i-1], p[i]
            yield p
        p[t], p[t + 1] = p[t+1], p[t]
        yield p
rathmann commented 4 years ago

I now have iterative implementations for all the algorithms in the Vajnovszki paper except Bell permutations (alg2). I skipped this one mostly because we already have a reasonable-looking Bell permutations implementation in SymPy.

For the jumping operators, as I mentioned before, I needed a permutation representation with an explicit length field. I chose the nested-list alternative, although it wouldn't be too hard to swap in an equivalent representation.

So, pp is a two-element list holding a length and a permutation list: pp==[2, [0,2,1,3,4]] corresponds to the permutation [2,1] and pp==[3, [0,2,1,3,4]] corresponds to [2,1,3].

I made a number of changes from my previous traverser. In particular, the caller needs to pass in the root node explicitly (since some ECO trees start at 0, and some at 1), and the depth of the stack is just depth of recursive calls -- it does not correspond to the length of the current permutation.

Here is code for the traverser, and operators corresponding to algorithms 4,5 and 6. I think the traverser is even a bit clearer than before, since it doesn't handle the depth explicitly. The operators are nearly mechanical adaptations of alg{4,5,6} from the repl.it versions.

def eco_traverse_jump(e_op, n, root_perm=None):
    """Iteratively traverse the tree generated by ECO operator e_op.

    Maintains traversal state in a stack of generators.  
    """
    pp = root_perm
    p = root_perm[1]

    if n==pp[0]:
        yield p # Caller asked for the root node
        return
    opstack = [None] * (n+2)  # Stack of generators for in-progress traversal
    sp = 0  # stack pointer, start at 0

    # Do not attempt to tie position on the stack to size of object,
    # since a node can have a child of size t+1 or t+2.

    opstack[sp] = e_op(pp, n) # Root from which the traversal expands

    while True:
        try:
            pp = next(opstack[sp])
            if pp[0] == n:
                yield(p)
            else:
                sp += 1
                opstack[sp] = e_op(pp, n)
        except StopIteration:
            sp -=1
            if sp < 0:
                return

#
# Now the operators that may generate children at either levels t+1 or t+2
# These are (nearly mechanical) adaptations of alg{4,5,6}

def involution_r_operator(pp, n):
    """operator for length-n involutions by recurrence
    yields child permutations of length t+1 and t+2.

    Corresponds to algorithm 4.

    Intuitively, this expands the argument involution three ways.
    1) Add element t+1 to the end as a new fixpoint
    2) Add two elements t+1 and t+2, and for each i in the argument:
       a) If i is a fixpoint, put it in a cycle (swap it with) t+2 only

       b) if i is in a cycle with j, break the i,j pair, and put i in
          a cycle with t+1, and j with t+2.  (And also i,t+2 and
          j,t+1).   XXX - double check that this is accurate.
    """
    p = pp[1]
    t = pp[0]
    pp[0] = t+1
    yield pp
    if t <= n - 2:
        pp[0] = t+2
        p[t + 1], p[t + 2] = p[t + 2], p[t + 1]
        yield pp
        p[t + 1], p[t + 2] = p[t + 2], p[t + 1]
        for i in range(1, t + 1):
            j = p[i]
            if j == i:
                p[i], p[t + 2] = p[t + 2], p[i]
                yield pp
                p[i], p[t + 2] = p[t + 2], p[i]
            else:
                p[t + 1], p[t + 2] = p[t + 2], p[t + 1]
                p[j], p[t + 2] = p[t + 2], p[j]
                p[i], p[t + 1] = p[t + 1], p[i]
                yield pp
                p[i], p[t + 1] = p[t + 1], p[i]
                p[j], p[t + 2] = p[t + 2], p[j]
                p[t + 1], p[t + 2] = p[t + 2], p[t + 1]
    pp[0] = t # At end, operator should return pp to original state

def pure_involution_operator(pp, n):
    """Yields the pure involution children of pp in the ECO traversal."""
    p = pp[1]
    t = pp[0]
    if t <= n - 2:
        pp[0] = t+2
        p[t + 1], p[t + 2] = p[t + 2], p[t + 1]
        yield pp
        p[t + 1], p[t + 2] = p[t + 2], p[t + 1]
        for i in range(1, t + 1):
            j = p[i]
            p[t + 1], p[t + 2] = p[t + 2], p[t + 1]
            p[j], p[t + 2] = p[t + 2], p[j]
            p[i], p[t + 1] = p[t + 1], p[i]
            yield pp
            p[i], p[t + 1] = p[t + 1], p[i]
            p[j], p[t + 2] = p[t + 2], p[j]
            p[t + 1], p[t + 2] = p[t + 2], p[t + 1]
    pp[0] = t # At end, operator should return pp to original state

def derangement_operator(pp, n):
    """Jumping ECO operator for derangements.  Needs eco_traverse_jump"""
    p = pp[1]
    t = pp[0]

    pp[0] = t+1
    for i in range(1, t + 1):
        p[i], p[t + 1] = p[t + 1], p[i]
        yield pp
        p[i], p[t + 1] = p[t + 1], p[i]
    if t <= n - 2:
        pp[0] = t+2
        p[t + 1], p[t + 2] = p[t + 2], p[t + 1]
        yield pp
        p[t + 1], p[t + 2] = p[t + 2], p[t + 1]
        for i in range(1, t + 1):
            j = p[i]
            p[t + 1], p[t + 2] = p[t + 2], p[t + 1]
            p[j], p[t + 2] = p[t + 2], p[j]
            p[i], p[t + 1] = p[t + 1], p[i]
            yield pp
            p[i], p[t + 1] = p[t + 1], p[i]
            p[j], p[t + 2] = p[t + 2], p[j]
            p[t + 1], p[t + 2] = p[t + 2], p[t + 1]
    pp[0] = t # Restore state

And here is code for an example wrapper function that we might (after some cleanup) expose to users.

def involutions_jump(n, fix0 = True):
    """Yield involutions, that is, permutations p such that p*p=identity.

    Corresponds to algorithm 4.

    If fix0 is true, the results yielded are tuples, which have been
    converted to Python 0-based conventions.  If false, the results
    are successive snapshots of the same list which is modified in the
    course of the algorithm.  The client must be sure to copy or
    otherwise use the results immediately.  XXX - put such verbiage in
    comments, but avoid being too repetitive.
    """

    pp = [0, list(range(n+1))]
    raw_iterator = eco_traverse_jump(involution_r_operator, n, pp)
    # This function returns a generator, rather than being a generator itself.
    # Pretty much the same thing for the caller.
    if fix0:
        return (tuple(i-1 for i in p[1:]) for p in raw_iterator)
    else:
        return raw_iterator

Note that the function returns an iterator, rather than being an iterator itself. I don't know if there is anything objectionable about this on Python style grounds. It does avoid a level of yield from, which was my motivation.

I did a little performance testing, and these are still significantly faster than the recursive versions, particularly with pypy3. (I suspect that the CPython interpreter optimizes yield from in ways that don't carry over to pypy.)

Involutions, for example, gets through the 46206736 length-16 involutions in about 7 seconds, for a rate of about 6 million involution permutations per second. The algorithm 3 and algorithm 4 versions are pretty close in speed, with perhaps a slight advantage for algorithm 3. This is without the copy and 0/1 adjustment. As I mentioned before, the cleanup step slows the process down substantially.

smichr commented 4 years ago

It's fun to read about what you are doing. I don't have as good of a "feel" for what is going on. The closest I got to that was in studying the bell-ringer algorithm and trying to make a game of it with students to see how well we could remember the steps.

rathmann commented 4 years ago

Trying to get back to this issue after a hiatus. I hope everyone monitoring this issue is still safe and healthy.

I read back through the discussion since January 4, and am struck by how many useful algorithms and implementations we now have. I think including at least some of these would be a nice extension and/or performance improvement for SymPy.

To my mind, the immediate issues are:

  1. Which new functions should we add (and which existing functions get improved implementations)?

  2. What general guidelines should we follow for the API?

  3. Who will be doing the enhancement(s), and in how many pull requests.

I have suggestions on all of the above, and will give them below, just as a starting point for discussion.

For number 1, I like the ECO implementation we now have for derangements, involutions, etc. Probably would skip Bell permutations (since there is already an implementation in the existing code) and put in only one implementation of involutions.

I also think that multiset derangements would greatly benefit from an improved implementation. @smichr posted an interesting version on Jan 23, 2020. I have done basic testing, and it always worked. I can't say that I understand it, though. The code is well commented, but I would benefit from a basic description of the high-level approach.

I also have a multiset derangements implementation which I have been playing with. It uses a straightforward (in my opinion) approach, and is pretty fast, especially with PyPy. I will describe it in another posting.

Smichr also gave implementations for random involutions and derangements. Those would certainly be interesting to add. I also really like the code for counting multiset derangements (Jan 12 posting). It would be nice to see this added. I am assuming that, due to a need to avoid import cycles, it shouldn't go into utilities.

For number 2, the API is a little tricky, since the existing code is not entirely consistent. In particular, some routines take only one representation of the desired set (e.g., generate_involutions), while others, e.g., multiset_permutations, take either a sequence or a multiset (dict).

I am assuming that backward compatibility outweighs any desire for consistency, and that we should not attempt to impose uniformity on the API. My guess is that the best approach would be to go with a more permissive/flexible interface for any new functionality, but, if possible, to have the special-case input handling centralized in one routine.

Another API issue is 0 or 1 based indices. I assume that the default would be that the user passes in a sequence, and we return tuples or lists of those sequence elements. But, we can also provide a (much faster) raw mode which just yields snapshots of an internal data structure, which is generally a list of indices. Currently, the ECO routines are 1-based. Worth trying to rework these to be 0-based for the raw mode?

For number 3, I would be happy to create a pull request with the ECO implementations. I could add the CatMultiset-based derangements code, if we decide to go with that version.

Chris, would you like to put in the random derangement/involution functions and multiset derangement counting function?

rathmann commented 4 years ago

First posting an overview of functions and classes for my implementation of multiset derangements. Overall, the technique is reasonably straightforward, but there is a fair amount of code.

The implementation is spread over multiple classes, and there are multiple versions of some functions. This has allowed for mixing-and-matching and easier evaluation. Deployment, if it is to happen at all, would presumably be much more streamlined.

Base utility class is CatMultiSet, with MultiElement as a helper struct. CatMultiSet is just a doubly linked list, and uses the unlink/relink operations from Knuth's dancing links to implement delete/insert in Constant Amortized Time (CAT). This only works if the link/unlink operations match like balanced parentheses in an expression; since CatMultiSet is used to support a backtracking search, that discipline is satisfied. Finding the next element still in the set is also constant time. CatMultiSet could also have been used for the F data structure for ECO involutions. (Postings of Feb 11 and 18.)

I was surprised not to see a general dancing links style backtracking implementation already in SymPy. I suppose that search is less critical to computer algebra than I might have guessed.

Built on top of CatMultiSet are a couple of implementations of multiset permutations, multiset_permutations_impl_r and multiset_permutations_impl. The "impl" means that I expect that these would be wrapped by a higher level routine that would allow for multiple input formats, and (optionally) remap the results into tuples or lists of the provided elements.

The multiset_permutations_impl_r function is performing pretty much the same recursion as the last few lines of multiset_permutations() in iterables.py. Then, multiset_permutations_impl() is just an iterative version of the recursive multiset_permutations_impl_r(). The iterative version is somewhat faster than the recursive version with interpreted Python3, and much faster with pypy. (I am assuming we can now freely use Python 3 forms like yield from?)

I have not yet tested the performance of these relative to the Takaoka routine that smichr posted on Jan 4.

The main purpose of the above two functions is as a warmup to the predicate-taking versions multiset_permutations_pred_impl_r() and multiset_permutations_pred_impl(). The basic idea for these came from Knuth's Algorithm X, TAOCP, 7.2.1.2, volume 4a, page 334 (although I didn't use his code). Since the permutation is built up left-to-right, the predicate returns False if it can determine that no permutation with the given prefix can be a member of the desired class. I only use this for a test for derangements, although Knuth also mentions it as a way to solve alphametic puzzles.

I made two versions of the derangements predicate. SimpleDerangementPred just tests if the value about to be placed would itself violate the derangement constraint. PigeonDerangementPred is more complicated and attempts to be sharp -- it should return True iff there is at least one derangement which has the given prefix. I was thinking of the pigeonhole principle when I wrote it, but now when I look at it, I think the analogy might be a bit of a stretch.

At any rate, on a test case of seq = '1112234444445555', multiset_permutations_pred_impl() generates the 634810 derangements in about 2 seconds with pypy on a raspberry pi, and about .6 seconds on a more conventional computer.

rathmann commented 4 years ago

And here is the code. This snippet should run as is with Python 3 or pypy3.

import time   # for test function at bottom
from sympy.utilities.iterables import multiset
from sympy import ordered

class MultiElement(object):
    """Used only with CatMultiSet. Contains the element itself, a count
    for how many times the element appears in the multiset, and left,
    right pointers since CatMultiSet uses a doubly linked list.
    """

    __slots__ = ['element',  # The element itself.
                 'count',  # multiplicity of the element
                 'left',  # left pointer (index)
                 'right'  # right index
                 ]

    def __init__(self, e, ct):
        self.element = e
        self.count = ct

class CatMultiSet(object):
    """Implements a multiset such that some of the operations are
    implemented in constant amortized time (CAT).
    """
    def __init__(self):
        pass

    def init_from_seq(self, seq):
        return self.init_from_m(multiset(seq))

    def init_from_m(self, m):
        g = [(k, m[k]) for k in ordered(m)]
        return self.init_from_g(g)

    def init_from_g(self, g):
        """Initialize from a list of tuples of the form (e, ct) where ct
        is the multiplicity of element e"""

        # create a header element for the linked list and store it at
        # position 0.
        m = MultiElement(None, 0)
        self.s = [m]
        for e, ct in g:
            m = MultiElement(e, ct)
            self.s.append(m)
        # Now initialize the left/right pointers to connect the circular
        # doubly linked list; the header at index 0 links to the first
        # and last real elements.
        sz = len(self.s)
        for i in range(sz):
            self.s[i].left = (i-1) % sz
            self.s[i].right = (i+1) % sz

    def decrement_element(self, i):
        """Decrement the multiplicity of the MultiElement at position i.
        If the multiplicity is now 0, unlink it from the list.

        decrement and increment operations are assumed to take place
        with a stack discipline -- each increment_element(i) undoes the
        most recent unmatched decrement_element(i), like balanced
        parentheses -- which is what lets the dancing-links relink in
        increment_element restore the list correctly."""

        m = self.s[i] # The node to be decremented, and perhaps unlinked
        m.count -= 1

        if m.count == 0:
            # Unlink the element.  This is the same operation used in
            # the Dancing Links algorithm.
            ml = self.s[m.left]  # Node to left of m
            mr = self.s[m.right]  # Node to right of m
            ml.right = m.right
            mr.left = m.left

    def increment_element(self, i):
        """Undo operation for decrement_element."""
        m = self.s[i]
        if m.count == 0:
             # Relink the element.  Also from Dancing Links
            ml = self.s[m.left]  # Node to left of m
            mr = self.s[m.right]  # Node to right of m
            ml.right = i
            mr.left = i
        m.count += 1

    # Convenience methods
    def total_multiplicity(self):
        return sum(m.count for m in self.s[1:])

    def initial_permutation(self):
        """returns permutation lowest in lexicographic order.

        Result is a list of indices."""
        base_perm = []
        j=self.first()
        while j != 0:
            base_perm.extend([j]*self.s[j].count)
            j = self.next(j)
        return base_perm

    # Constant time (per step) iteration through the set.
    def first(self):
        """return index of the first element currently in the multiset"""
        return self.s[0].right

    def next(self, i):
        """returns index of the element in the multiset to the right of i"""
        return self.s[i].right

class SimpleDerangementPred(object):
    """Test whether any permutation which would be a child of this state
    of the traversal could be a derangement.

    If pred() method returns False, traversal can prune the subtree.
    """

    def __init__(self, cm):
        self.size = cm.total_multiplicity()

        # Base perm is the reference point for all the derangements.
        # For permutation p to be a derangement, for all i,
        # p[i] != base_perm[i]
        self.base_perm = cm.initial_permutation()

    def pred(self, p, cm, j, i):
        """return False if placing element j in position i would result in
        p NOT being a derangement.

        In general, this test prunes some, but not all, of the bad
        branches from the traveral.  If cm is a set, this test is
        sufficient.
        """
        return (self.base_perm[i] != j)

class PigeonDerangementPred(object):
    """Provides a (more advanced) test for whether the traversal of the
    tree of permutations should be pruned at a particular permutation.

    Using this version may or may not speed up the overall traversal.
    This predicate will require fewer permutations to be explored, but
    the test itself is more costly.

    In tests so far, this predicate is sufficient to prune all the bad
    branches from the traversal.  I have not verified that it will
    always work.
    """

    def __init__(self, cm):
        self.size = cm.total_multiplicity()

        # Cache some data which will be useful for evaluating the
        # predicate.  See the definition of pred and surrounding
        # comments for how they are used.

        # Cache just the counts from the (original) multiset
        self.counts = [m.count for m in cm.s]

        # Base perm is the reference point for all the derangements.
        # For permutation p to be a derangement, for all i,
        # p[i] != base_perm[i]
        self.base_perm = cm.initial_permutation()

        self.extra_prune = 0  # DEBUG instrumentation.  Remove later

        # Semantics: If base_perm[i] == j, then right_counts[i] is the
        # number of j values in base perm which are to the right of
        # position i.  Can be 0.  Purpose is to give a quick way to
        # calculate how many of the slots to the right of i are taken
        # up by j values.
        self.right_counts = []
        j=cm.first()
        while j != 0:
            ct = cm.s[j].count # number of j's in multiset
            self.right_counts.extend(reversed(range(ct)))
            j = cm.next(j)

    def pred(self, p, cm, j, i):
        """return False if placing element j in position i would result in
        p having a prefix that could not be completed to
        a derangement.

        Current implementation is NOT CAT.  Takes time proportional to
        the number of multielements in cm at this stage of the
        traversal.  Probably there could be a version which reduces
        this overhead, but current version is *reasonably*
        straightforward.

        Basic constraint checked: for each element j2 still in cm, the
        remaining copies of j2 must fit into the open slots to the right
        of position i whose base_perm value is not j2 (a pigeonhole
        argument).
        """

        if (self.base_perm[i] == j):  # First do the simple test.
            return False
        else:
            # Test is to check each MultiElement of cm and see if
            # there are enough slots left open that do not have a
            # value of j in base_perm
            j2 = cm.first()
            k = self.base_perm[i]
            # slots_avail is the number of slots in the permutation to
            # the right of position i.
            slots_avail = self.size - i - 1
            while j2 != 0:
                if (j2<k) or j2 == j:
                    # Skip: base_perm is sorted, so for j2 < k no copy of
                    # j2 appears to the right of position i and the
                    # pigeonhole test below cannot fail for it; j2 == j is
                    # the element currently being placed at position i.
                    pass
                elif j2 == k:
                    # ct_j2_occupied is the number of slots to the right of i
                    # which are taken up by j2 values.
                    ct_j2_occupied =  self.right_counts[i]
                    # ct_j2_needed is number of j2 values to place
                    ct_j2_needed = cm.s[j2].count
                    if  ct_j2_occupied + ct_j2_needed > slots_avail:
                        # Fail based on pigeonhole principle.  There
                        # are more values of j2 than there are (non
                        # derangement violating) places to put them.
                        self.extra_prune += 1
                        return False
                else:  # j2>k, matches elif
                    # All the j2 values in base_pred are to the right of i
                    ct_j2_occupied = self.counts[j2]
                    ct_j2_needed = cm.s[j2].count
                    if  ct_j2_occupied + ct_j2_needed > slots_avail:
                        self.extra_prune += 1
                        return False

                j2 = cm.next(j2)
            return True

def multiset_permutations_impl_r(cm, p=None, i=0):
    """Yields the permutations of multiset cm as successive snapshots

    End users will likely want to call a higher-level function that
    wraps this implementation.

    Argument cm must be a CatMultiSet object.

    This version is recursive, and follows closely the structure of
    the code in sympy up to at least version 1.6.  It uses different
    data structures, and is meant to be an intermediate point on the
    way to an iterative version.
    """

    if not p:
        p = [0] * cm.total_multiplicity()  # List for the permutation(s).

    # Variables: i indexes into the permutation, while j indexes into
    # the multiset.  The resulting permutations are lists of such j
    # indexes.
    j=cm.first()
    while j != 0:
        p[i] = j
        cm.decrement_element(j)
        if i == len(p) - 1:
            # Complete permutation
            yield p
        else:
            yield from multiset_permutations_impl_r(cm, p, i+1)
        cm.increment_element(j)
        j = cm.next(j)

def multiset_permutations_impl(cm):
    """Yields the permutations of multiset cm as successive snapshots

    Argument cm must be a CatMultiSet object. Permutations are yielded
    as successive snapshots of a list.

    This traverses the tree of permutations in the same order as that
    of the existing sympy implementation.  Uses CatMultiSet to avoid
    some data copies, and uses iteration in place of recursion.
    """

    # Usually transforming an algorithm from recursive to iterative
    # requires provisioning a stack of some kind.  In this case, the
    # permutation p, stores sufficient state that no such stack is
    # needed.

    p = [0] * cm.total_multiplicity()
    i = 0 # Position within permutation
    j = cm.first()  # index into cm; also the candidate value for p[i]

    # Invariant: The union of the filled in part of the permutation
    # with the current state of cm should be the original multiset.
    # Hence when we add an element of to the permutation with p[i]=j,
    # we have to remove it from cm with a call to
    # cm.decrement_element(j).

    while True:
        if (j != 0):
            # There are still values in the multiset to explore for
            # this level of the permutation, so fill in j
            p[i] = j
            cm.decrement_element(j)
            if i == len(p) - 1:
                yield p    # Complete permutation
                cm.increment_element(j)
                j = cm.next(j)
            else:
                # Still in interior of the permutation, continue to next longer
                # (Equivalent of recursive call at i+1)
                i = i+1
                j = cm.first()
                continue
        else: # j==0, meaning no more elements to explore at this level.
            if i==0:
                return  # Already at top level, done overall.
            # Recursive return / backtrack.   Go back a level, and put old
            # last value of permutation, p[i], back into cm

            i = i-1
            j = p[i]
            cm.increment_element(j)
            j = cm.next(j)

def multiset_permutations_pred_impl_r(cm, ptest, p=None, i=0):
    """Yields the filtered subset of multiset permutations in
    lexicographic order.

    ptest is a predicate object with a .pred(p, cm, j, i) method; when it
    returns False the subtree of permutations with the current prefix is
    pruned (compare Algorithm X, TAOCP 7.2.1.2).  So far this is used for
    derangements only.
    """
    if not p:
        p = [0] * cm.total_multiplicity()  # List for the permutation(s).

    # Variables: i indexes into the permutation, while j indexes into
    # the multiset.  The resulting permutations are lists of such j
    # indexes.
    j=cm.first()
    while j != 0:
        if ptest.pred(p, cm, j, i):
            p[i] = j
            cm.decrement_element(j)
            if i == len(p) - 1:
                # Complete permutation
                yield p
            else:
                yield from multiset_permutations_pred_impl_r(cm, ptest, p, i+1)
            cm.increment_element(j)
        j = cm.next(j)

def multiset_permutations_pred_impl(cm, ptest):
    """Yields filtered subset of permutations of multiset cm

    Argument cm must be a CatMultiSet object. Permutations are yielded
    as successive snapshots of a list.

    This traverses the tree of permutations in the same order as that
    of the existing sympy implementation.  Uses CatMultiSet to avoid
    some data copies, and uses iteration in place of recursion.
    """

    # Usually transforming an algorithm from recursive to iterative
    # requires provisioning a stack of some kind.  In this case, the
    # permutation p, stores sufficient state that no such stack is
    # needed.

    p = [0] * cm.total_multiplicity()
    i = 0 # Position within permutation
    j = cm.first()  # index into cm; also the candidate value for p[i]

    # Invariant: The union of the filled in part of the permutation
    # with the current state of cm should be the original multiset.
    # Hence when we add an element of to the permutation with p[i]=j,
    # we have to remove it from cm with a call to
    # cm.decrement_element(j).

    while True:
        if (j != 0):
            # There are still values in the multiset to explore for
            # this level of the permutation, so fill in j, but only if
            # we pass the predicate
            if ptest.pred(p, cm, j, i):
                p[i] = j
                cm.decrement_element(j)
                if i == len(p) - 1:
                    yield p    # Complete permutation
                    cm.increment_element(j)
                    j = cm.next(j)
                else:
                    # Still in interior of the permutation, continue
                    # to next longer (Equivalent of recursive call at
                    # i+1)
                    i = i+1
                    j = cm.first()
                    continue
            else: # failed ptest, skipping branch
                j = cm.next(j)
        else: # j==0, meaning no more elements to explore at this level.
            if i==0:
                return  # Already at top level, done overall.

            # Backtrack.  Put old last value of permutation, p[i],
            # back into cm.  Variable j gets successor of p[i]
            i = i-1
            j = p[i]
            cm.increment_element(j)
            j = cm.next(j)

def test_multiset_permutations_pred_impl(s):
    """Runs multiset_permutations_pred_impl on sequence s, and 
    prints out the number of derangements generated and the time required."""

    # Takes a little setup to actually call the functions
    cm = CatMultiSet()
    cm.init_from_seq(s)
    pred = PigeonDerangementPred(cm)

    a = time.perf_counter()
    nl = sum(1 for p in multiset_permutations_pred_impl(cm,pred))
    b = time.perf_counter()

    print("For sequence", s, "Found", nl, "derangements.  Elapsed:", b-a)

if __name__ == "__main__":

    test_multiset_permutations_pred_impl('1112234444445555')
rathmann commented 4 years ago

Just wanted to follow up with a quick benchmark. The string '1112234444445555' has 100900800 permutations. Generating all of them with pypy3 on my laptop takes:

multiset_permutations_impl (routine posted yesterday): 21.76 seconds, a rate of about 4636428 permutations per second.

msp (Takaoka routine, as posted on Stack Exchange): 55.39 seconds, a rate of about 1821616 permutations per second.

The above is not quite fair, since the Takaoka routine is copying the data, and my routine is just giving a snapshot of an internal data structure. So, in msp, I replaced yield visit(head) with yield head, and this gave:

msp (no copy): 11.68 seconds, a rate of about 8639436 permutations per second.

So, if what I have done is reasonable, the Takaoka routine is the fastest by a good margin.

For comparison, the existing SymPy routine (with data copying) is much slower: 240.53 seconds for the 100900800 permutations, a rate of about 419501 permutations per second.

So either of the newer versions would be a big improvement.

rathmann commented 4 years ago

Hello,

I hope @smichr is enjoying an August holiday.

I took another look at the Takaoka paper referenced in the comment to the msp() function. I don't (yet) understand the algorithm, but from the description, it defines an order on the permutations such that each differs from its predecessor by only one swap. This helps the algorithm to produce the next permutation in strictly constant time. (Some papers use the term loopless.)

The version I posted on August 8 generates the permutations in lexicographic order. Since many elements may change places between adjacent permutations in this order, no such algorithm can be loopless.

A bad case for the routine I posted would be a string of the form a^n b, i.e., n copies of 'a' followed by a single 'b'. There are only n+1 permutations of such a sequence, and I would assume the Takaoka algorithm would cycle through them in constant time each. My posted routine would take linear time per permutation for this case, since every time it moves a 'b' towards the front of the permutation, it would then move all the 'a' elements that follow it from the multiset back into the permutation.
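A quick way to see the n+1 count concretely, using the existing SymPy multiset_permutations just for illustration:

from sympy.utilities.iterables import multiset_permutations

# n copies of 'a' followed by one 'b' have only n + 1 distinct
# permutations; in lexicographic order each successive permutation moves
# the single 'b' one position closer to the front, and refilling the
# trailing 'a' positions is where the linear per-step cost comes from.
n = 5
seq = 'a' * n + 'b'
perms = list(multiset_permutations(seq))
assert len(perms) == n + 1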

Presumably, we could come up with a different lexicographic algorithm which would take care of this particular case with less overhead, but I don't know overall what is the worst case in terms of amount of change between adjacent permutations in lexicographic order.

In many cases, I would think that my August 8 routine might still achieve CAT performance. I think this would be the case as long as changes close to the head of the permutation are exponentially less common than changes which only involve elements close to the tail.

Of course, as implementors of SymPy, the fine gradations in O(n) performance between different published algorithms are probably less important than other considerations. I assume most client usage would involve actually reading the permutations, and so their performance will be at best linear per permutation.

rathmann commented 3 years ago

Just a quick check since this issue has been quiet for some months now -- any interest in reviving this?

It seems we have a number of algorithms/implementation which would be significant improvements over what is currently available in SymPy.

smichr commented 3 years ago

Hi @rathmann , shall we start to make changes to the existing algorithms and then think about adding others?

rathmann commented 3 years ago

Hi Chris, glad to hear from you.

I am afraid I am tied up through the end of next week, but would be happy to look at it after that.

If I recall, the ones that I touched were the Vajnovszki ECO algorithms (my iterative modifications of your implementations) and a straightforward (I thought so at least) algorithm for multiset permutations/derangements.

You had a number of other interesting algorithms -- I will have to get back up to speed on those.

smichr commented 3 years ago

Tim Peters left a gem here that computes the minimal set bases for a multiset.

e.g.

>>> ms
[1, 1, 2, 3, 3, 3, 4, 4, 5]
>>> for i in allsplit(_):
...  print(i)
...
[{1, 2, 3, 4, 5}, {1, 3, 4}, {3}]
[{1, 2, 3, 4}, {1, 3, 4, 5}, {3}]
[{1, 2, 3, 4}, {1, 3, 4}, {3, 5}]
[{1, 2, 3, 4, 5}, {1, 3}, {3, 4}]
[{1, 2, 3, 4}, {1, 3, 5}, {3, 4}]
[{1, 2, 3, 4}, {1, 3}, {3, 4, 5}]
[{1, 2, 3, 5}, {1, 3, 4}, {3, 4}]
[{1, 2, 3}, {1, 3, 4, 5}, {3, 4}]
[{1, 2, 3}, {1, 3, 4}, {5, 3, 4}]
[{1, 4, 3, 5}, {1, 4, 3}, {2, 3}]
[{1, 4, 3}, {1, 4, 3}, {2, 3, 5}]
[{1, 5, 3, 4}, {1, 3}, {2, 3, 4}]
[{1, 3, 4}, {1, 3, 5}, {2, 3, 4}]
[{1, 3, 4}, {1, 3}, {2, 3, 4, 5}]
rathmann commented 3 years ago

(Referring to stackoverflow link, generating all possible minimal splits of a multiset into different subsets.)

Interesting. Any idea if there is a use case beyond the puzzle value? It is impressive how Peters leverages the Python built-ins to get a very concise solution.

So this could almost be produced by the existing multiset partition machinery:

I say almost since the parts generated are themselves multisets. You could filter to yield only those which turn out to be sets. Better would be to see if the Knuth algorithm can be modified to generate only set-valued parts. The result would of course be much longer than the given solution.
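A minimal generate-and-filter sketch of the first option (the function name is mine, and the choice of SymPy's multiset_partitions is an assumption; note that this yields all set-valued partitions, not only the minimal splits that Peters' code produces):

from sympy.utilities.iterables import multiset_partitions

def set_valued_partitions(seq):
    # Keep only those multiset partitions whose parts contain no repeated
    # element, i.e. whose parts are genuine sets.
    for parts in multiset_partitions(seq):
        if all(len(p) == len(set(p)) for p in parts):
            yield parts

# For [1, 1, 2] this yields [[1, 2], [1]] and [[1], [1], [2]], but skips
# [[1, 1, 2]] and [[1, 1], [2]].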

rathmann commented 3 years ago

Since there has been extensive discussion over the years, I thought I would take inventory of the algorithms/implementations under discussion. (Sorry for any omissions/mistakes. I figure we can edit this post as we find omissions or decide on more.)

In terms of planning, I think my 5 Aug 2020 post is still relevant.

Specials - not generators for sets or multisets

Sets - generators for subsets of permutations

Listed in order from Vajnovszki ECO paper. smichr adapted these to Python, and (except for bell permutations) rathmann reimplemented with faster iterative versions (23 Feb 2020).

Multisets - generators for (subsets of) multiset permutations

rathmann commented 3 years ago

I have been looking at the derangements() function from smichr's Jan 23, 2020 posting. Multiset derangements seems like a good candidate for this next round of changes, since we have two implementations, either of which would be an improvement over the existing code in SymPy.

I can't say that I understand this function in detail, and so wanted to ask for some more description. I will give a quick overview of my current understanding. @smichr, can you let me know if what follows is (roughly) accurate?

Let's maintain 3 data structures to describe the current state of the generation process: S, the reference arrangement of the elements (the permutation that every derangement must differ from at each position); rv, the result list being built up, with an "open" sentinel marking unfilled positions; and P, the multiset of elements remaining to be placed, with their multiplicities.

At any given point in the computation, we select an element e from P, with multiplicity k. We find those positions into which e could be placed. These are the indexes which have the "open" sentinel value in rv and whose value in S is not equal to e. If there are j such positions, there are Choose(j,k) possible ways to place the instances of e into rv. We could enumerate these possibilities with itertools.combinations() or the subsets() function in iterables.

For each such placement, we evaluate a recursive subproblem with a copy of rv in which e is filled into the indicated positions, and a copy of the multiset with e deleted.

Smichr's actual implementation avoids overhead from recursion and data copies, but explores the same tree of possibilities in the same order as this recursive scheme. (Please let me know if this is true.)

By ordering the multiset so that the elements with greater multiplicity are placed first, the algorithm seeks to reduce the amount of time it spends exploring blind alleys.
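To check my reading, here is a hedged recursive sketch of that scheme (the names and details are mine, not smichr's actual code; in particular it recurses and re-selects the element of largest remaining multiplicity at each level rather than reusing buffers):

from itertools import combinations

def multiset_derangements_sketch(S):
    # S is the reference sequence; rv is the result being filled in, with
    # None as the "open" sentinel; P maps each element to its remaining
    # multiplicity.
    S = list(S)
    P = {}
    for e in S:
        P[e] = P.get(e, 0) + 1
    rv = [None] * len(S)

    def place():
        if not P:
            yield list(rv)
            return
        # Place the element with the largest remaining multiplicity first.
        e = max(P, key=P.get)
        k = P.pop(e)
        open_slots = [i for i, v in enumerate(rv) if v is None and S[i] != e]
        for chosen in combinations(open_slots, k):
            for i in chosen:
                rv[i] = e
            yield from place()
            for i in chosen:
                rv[i] = None
        P[e] = k

    yield from place()

# e.g. sum(1 for _ in multiset_derangements_sketch('aabbcc')) gives 10.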

If my understanding is correct, such an ordering of the elements reduces but doesn't completely eliminate blind alleys. Consider the multiset corresponding to the sequence 'aabbcc'. Suppose the instances of a have already been placed, and rv is now [, , a, a, , ], and b is the next element to place. There are Choose(4,2)==6 ways to place b, but only one of them will result in a derangement. Hence, I would call the other 5 placements of b blind alleys. I don't know if such blind alleys are going to be numerous in practice. I imagine that figuring this out would be an interesting exercise in combinatorics, but not one I would like to attempt right now. Alternatively, it shouldn't be too hard to instrument a version of this algorithm and try a bunch of test cases.
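That particular claim is easy to confirm by brute force (plain generate-and-filter, independent of either proposed implementation):

from itertools import permutations

S = list('aabbcc')
derangements = {p for p in permutations(S)
                if all(x != y for x, y in zip(p, S))}
# Fix the a's at positions 2 and 3 and see how many derangements remain.
with_a_at_2_3 = [p for p in derangements if p[2] == 'a' and p[3] == 'a']
print(len(derangements), len(with_a_at_2_3))   # prints: 10 1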

smichr commented 3 years ago

I can take a look at this later today. I am imagining some better ways to handle blind alleys with repeated elements that have the same large multiplicity.

Also, I would suggest adding nD to count derangements. And I would make the derangements relative to the initial permutation of the elements, as discussed above, using something like permof.
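For reference, a naive version of such a count (plain generate-and-filter against the initial arrangement, just to pin down what nD should return; the counting routine posted earlier in the thread would presumably be far faster):

from sympy.utilities.iterables import multiset_permutations

def nD_reference(seq):
    # Count permutations of seq that differ from seq at every position.
    seq = list(seq)
    return sum(1 for p in multiset_permutations(seq)
               if all(x != y for x, y in zip(p, seq)))

# nD_reference('aabbcc') == 10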

rathmann commented 3 years ago

Yes, counting multiset derangements would be a nice addition. I included it in my catalog from a couple of weeks ago.

Mostly at this point I want to see if I understand your derangements() algorithm. I thought I was missing something. For example, if I were to implement a version of this algorithm, my first thought would be to have a stack of itertools.combinations() iterators - one for every element that has already been placed. Your implementation seems to manage with just one active iterator, so I assume it is doing something different.

I was figuring that only when we both understand both proposed algorithms will we be in a position to choose which goes into SymPy.

I think my lexicographic order version is pretty straightforward. (Except perhaps for PigeonDerangementPred, but that is an add-on for performance.) But of course, everyone thinks their own code is especially clear.