a bijectionist's toolkit

mantepse commented 2 years ago

We provide a toolkit for the combinatorialist to help find functions ("statistics") s: A -> Z and bijections A -> B given sequences of finite sets A and B that satisfy various constraints.

Depends on #34881 Depends on #34878

CC: @stumpc5 @tscrim @fchapoton

Component: combinatorics

Author: Alexander Grosz, Tobias Kietreiber, Stephan Pfannerer, Martin Rubey

Branch/Commit: u/mantepse/a_bijectionist_s_toolkit @ 53506aa

Reviewer: Matthias Koeppe, ...

Issue created by migration from https://trac.sagemath.org/ticket/33238

mantepse commented 1 year ago

comment:75

There is now a weird bug with SCIP:

Currently (after adding default_mip_solver("SCIP") at the very beginning, most tests fail with an error like

File "src/sage/combinat/bijectionist.py", line 160, in sage.combinat.bijectionist
Failed example:
    for D in sorted(bij.minimal_subdistributions_iterator(), key=lambda x: (len(x[0][0]), x)):
        ascii_art(D)
Exception raised:
    Traceback (most recent call last):
      File "/home/martin/sage-develop/src/sage/doctest/forker.py", line 695, in _run
        self.compile_and_execute(example, compiler, test.globs)
      File "/home/martin/sage-develop/src/sage/doctest/forker.py", line 1093, in compile_and_execute
        exec(compiled, globs)
      File "<doctest sage.combinat.bijectionist[37]>", line 1, in <module>
        for D in sorted(bij.minimal_subdistributions_iterator(), key=lambda x: (len(x[Integer(0)][Integer(0)]), x)):
      File "/home/martin/sage-develop/src/sage/combinat/bijectionist.py", line 1901, in minimal_subdistributions_iterator
        new_s = self._find_counter_example(self._A, s, d, False)
      File "/home/martin/sage-develop/src/sage/combinat/bijectionist.py", line 1967, in _find_counter_example
        bmilp.solve(tmp_constraints)
      File "/home/martin/sage-develop/src/sage/combinat/bijectionist.py", line 2752, in solve
        self.milp.remove_constraints(new_indices)
      File "sage/numerical/mip.pyx", line 2382, in sage.numerical.mip.MixedIntegerLinearProgram.remove_constraints
        self._backend.remove_constraints(constraints)
      File "sage/numerical/backends/generic_backend.pyx", line 432, in sage.numerical.backends.generic_backend.GenericBackend.remove_constraints
        self.remove_constraint(c)
      File "sage/numerical/backends/scip_backend.pyx", line 360, in sage.numerical.backends.scip_backend.SCIPBackend.remove_constraint
        raise ValueError("The constraint's index i must satisfy 0 <= i < number_of_constraints")
    ValueError: The constraint's index i must satisfy 0 <= i < number_of_constraints

Adding

!#diff
diff --git a/src/sage/combinat/bijectionist.py b/src/sage/combinat/bijectionist.py
index 4da0be9fd9f..661677f78e3 100644
--- a/src/sage/combinat/bijectionist.py
+++ b/src/sage/combinat/bijectionist.py
@@ -38,6 +38,7 @@ A guided tour
     We find a statistic `s` such that
     `(s, wex, fix) \sim (llis, des, adj)`::

+        sage: default_mip_solver("SCIP")
         sage: N = 3
         sage: As = [list(Permutations(n)) for n in range(N+1)]
         sage: A = B = sum(As, [])
@@ -2747,6 +2748,7 @@ class _BijectionistMILP():
             self.last_solution = self.milp.get_values(self._x,
                                                       convert=bool, tolerance=0.1)
         finally:
+            repr(self.milp)
             self.milp.remove_constraints(new_indices)

         # veto the solution, by requiring that not all variables with

this problem goes away, and the output of the tests is OK. What could this be?

mantepse commented 1 year ago

comment:76

Instead of repr(self.milp), doing

            m = self.milp
            b = m.get_backend()
            [b.objective_coefficient(i) for i in range(b.ncols())]

is enough to make things work.

mantepse commented 1 year ago

comment:77

            m = self.milp.get_backend()._get_model()
            if m.getStatus() != 'unknown':            
                m.freeTransform()

is also enough.

mantepse commented 1 year ago

comment:78

This raises yet another question: according to the documentation, https://scipopt.github.io/PySCIPOpt/docs/html/classpyscipopt_1_1scip_1_1Model.html#a13c33bbfda259edd6249879a8cc3bcb8:

Frees all solution process data including presolving and transformed problem, only original problem is kept

But this means that adding and removing constraints again is really expensive, no?

mkoeppe commented 1 year ago

comment:79

Replying to Martin Rubey:

This raises yet another question: according to the documentation, https://scipopt.github.io/PySCIPOpt/docs/html/classpyscipopt_1_1scip_1_1Model.html#a13c33bbfda259edd6249879a8cc3bcb8:
Frees all solution process data including presolving and transformed problem, only original problem is kept
But this means that adding and removing constraints again is really expensive, no?

Yes.

mkoeppe commented 1 year ago

comment:80

Replying to Martin Rubey:

            m = self.milp.get_backend()._get_model()
            if m.getStatus() != 'unknown':            
                m.freeTransform()

is also enough.

Can you find a fix for scip_backend.pyx?

mantepse commented 1 year ago

comment:81

No, I have no idea what's going on, because we have

    cpdef remove_constraint(self, int i):
...
        if i < 0 or i >= self.nrows():
            raise ValueError("The constraint's index i must satisfy 0 <= i < number_of_constraints")
        if self.model.getStatus() != 'unknown':
            self.model.freeTransform()
        self.model.delCons(self.model.getConss()[i])

and using this method (instead of using remove_constraints) fails also.

mkoeppe commented 1 year ago

comment:82

Unrelated comment:

+        # convert input to set of block representatives
+        blocks = set()
+        if p in self._A:
+            blocks.add(self._P.find(p))
+        elif type(p) is list:  # TODO: this looks very brittle
+            for p1 in p:
+                if p1 in self._A:
+                    blocks.add(self._P.find(p1))
+                elif type(p1) is list:
+                    for p2 in p1:
+                        blocks.add(self._P.find(p2))

Instead of type(...) is list, you can use isinstance(..., collections.abc.Iterable) - https://docs.python.org/3/library/collections.abc.html#collections-abstract-base-classes

Also after each the elif blocks, there should probably be an else block that raises an error.

mantepse commented 1 year ago

comment:83

Replying to Matthias Köppe:

Instead of type(...) is list, you can use isinstance(..., collections.abc.Iterable) - https://docs.python.org/3/library/collections.abc.html#collections-abstract-base-classes

I don't think so, because elements of A might be iterable. We require the objects to be hashable, so we can allow lists.

mkoeppe commented 1 year ago

comment:84

OK, I'll take care of #21003 comment:139 first and then try to reproduce comment:75

mkoeppe commented 1 year ago

comment:85

I've pushed a branch with merged dependencies to public/33238. Can't reproduce the problem

mkoeppe commented 1 year ago

comment:86

Unrelated:

            # function adding a solution to dict of solutions
            def add_solution(solutions, solution):
                for p, value in solution.items():
                    if p not in solutions:
                        solutions[p] = set()
                    solutions[p].add(value)

            [...]

            solutions = {}
            add_solution(solutions, solution)

This can be simplified using solutions = defaultdict(set)

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 1 year ago

Branch pushed to git repo; I updated commit sha1. New commits:

2d34911 slightly simplify logic of _forced_constant_blocks, use defaultdict

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 1 year ago

Changed commit from 64101a8 to 2d34911

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 1 year ago

Branch pushed to git repo; I updated commit sha1. New commits:

c2f0062 slight simplification

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 1 year ago

Changed commit from 2d34911 to c2f0062

mkoeppe commented 1 year ago

comment:89

Replying to Matthias Köppe:

Replying to Martin Rubey:

But this means that adding and removing constraints again is really expensive, no?

Yes.

I think it would make sense to rework it so that you only make copies and add constraints, rather than trying to remove constraints. This would probably simplify the logic of the code (helping to get rid of the notion of "temporary constraints").

And it also makes sense from the viewpoint of the polyhedral enumeration method: Some polyhedral computation backends (including Normaliz) have some support for adding constraints (although this is not exposed yet in Sage's Polyhedron classes)

mantepse commented 1 year ago

comment:90

Well, currently removing is much faster, even after the patch in #21003.

The difference in code is quite local - it only affects the _BijectionistMILP.solve method.

So, if possible I'd rather concentrate on the design and, if this is reasonable, get it into sage in the hope that some interesting use cases pop up. I am using it in my day to day research, but of course, that's a rather biased view.

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 1 year ago

Branch pushed to git repo; I updated commit sha1. New commits:

78968ae copy (instead of deepcopy) should be correct

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 1 year ago

Changed commit from c2f0062 to 78968ae

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 1 year ago

Branch pushed to git repo; I updated commit sha1. New commits:

6ddaeae slightly simplify _preprocess_intertwining_relations

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 1 year ago

Changed commit from 78968ae to 6ddaeae

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 1 year ago

Changed commit from 6ddaeae to 60286e6

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 1 year ago

Branch pushed to git repo; I updated commit sha1. New commits:

60286e6 slightly improve non_copying_intersection

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 1 year ago

Branch pushed to git repo; I updated commit sha1. New commits:

c38f5b9 add some internal documentation

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 1 year ago

Changed commit from 60286e6 to c38f5b9

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 1 year ago

Branch pushed to git repo; I updated commit sha1. New commits:

6fb06f0 untangle _preprocess_intertwining_relations

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 1 year ago

Changed commit from c38f5b9 to 6fb06f0

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 1 year ago

Changed commit from 6fb06f0 to 706056e

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 1 year ago

Branch pushed to git repo; I updated commit sha1. New commits:

706056e add possibility to require a homomesy

mantepse commented 1 year ago

comment:97

concerning performance - it would be very nice if we could improve tasks like the following:

sage: default_mip_solver("SCIP")
sage: from sage.combinat.cyclic_sieving_phenomenon import orbit_decomposition
sage: f = findstat()
sage: f._allow_execution=True
sage: %cpaste
Pasting code; enter '--' alone on the line to stop or use Ctrl-D.
:def rotate_permutation(p):
:    cycle = Permutation(tuple(range(1, len(p)+1)))
:    return Permutation([cycle.inverse()(p(cycle(i))) for i in range(1, len(p)+1)])
:
:--
sage: s = findstat(1377)
sage: n=5; A = B = Permutations(n)
sage: Q = orbit_decomposition(A, rotate_permutation)
sage: bij = Bijectionist(A, B, s)
sage: bij.set_homomesic(Q)
sage: %time bij.constant_blocks(optimal=True)
CPU times: user 9.12 s, sys: 40 ms, total: 9.16 s
Wall time: 9.17 s
{{[1, 2, 3, 4, 5], [2, 3, 4, 5, 1], [3, 4, 5, 1, 2], [4, 5, 1, 2, 3], [5, 1, 2, 3, 4]}}
sage: it = bij.minimal_subdistributions_iterator()
sage: next(it)
([[5, 1, 2, 3, 4]], [0])
sage: next(it)
([[4, 5, 1, 2, 3]], [0])
sage: next(it)
([[3, 4, 5, 1, 2]], [0])
sage: next(it)
([[2, 3, 4, 5, 1]], [0])
sage: next(it)
([[1, 2, 3, 4, 5]], [0])
sage: %time next(it)
... takes "forever" ...

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 1 year ago

Changed commit from 706056e to 4e90280

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 1 year ago

Branch pushed to git repo; I updated commit sha1. New commits:

`5bd8f39`	`MixedIntegerLinearProgram.add_constraint: Return row index`
`e7df521`	`MixedIntegerLinearProgram.add_constraint: Do not refuse to add empty constraints; they could be infeasible or we may need their row index for later`
`d1fad0c`	`src/sage/features/mip_backends.py: Add CVXOPT feature`
`c57da31`	`return indices of added rows only on demand and return a list`
`cfe24fe`	`add a doctest for adding and removing constraints`
`6c34bc3`	`make linter happy`
`3b929f8`	`forgot newline in docstring`
`d8a0663`	`Merge branch 'u/mantepse/mixedintegerlinearprogram_add_constraint__return_row_index' of trac.sagemath.org:sage into t/33238/a_bijectionist_s_toolkit`
`b03cf22`	`take care of the corner case for all backends in the frontend`
`4e90280`	`Merge branch 'u/mantepse/allow_to_remove_no_constraints' of trac.sagemath.org:sage into t/33238/a_bijectionist_s_toolkit`

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 1 year ago

Changed commit from 4e90280 to 65b0c4a

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 1 year ago

Branch pushed to git repo; I updated commit sha1. New commits:

65b0c4a preserve the cache of solutions after computing the optimal constant blocks

mantepse commented 1 year ago

comment:100

I don't think I can improve this much further now. Please review!

mkoeppe commented 1 year ago

comment:101

I'd suggest to combine "solve" and "solution" into one method to eliminate some state from the interface

mkoeppe commented 1 year ago

comment:102

... and to eliminate solution_index

mantepse commented 1 year ago

comment:103

Do you mean to combine solve and solution into something like the following?

def solution(self, on_blocks, temporary_constraints, new_solution):

I agree that this makes sense.

I don't see how to eliminate solution_index entirely (other than making it Boolean). There are three (slightly) differen clients:

get any (initial) solution
get a solution with some additional, temporary, constraints
get a new solution

The last version is needed in the solutions_iterator. It differs from 1., because it may be quite hard to find a new solution, but easy to get just any.

I could also offer two different methods, solution and new_solution, but this would split the code in a very artificial way - solution may need to call new_solution with temporary constraints.

mantepse commented 1 year ago

comment:104

It turns out that just new instead of solution_index is not enough, because I want to be able to guarantee the same order of solutions, see doctests of solution_iterator.

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 1 year ago

Changed commit from 65b0c4a to 21cb54f

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 1 year ago

Branch pushed to git repo; I updated commit sha1. New commits:

21cb54f make _BijectionistMILP.solution the only entrypoint

mkoeppe commented 1 year ago

comment:106

Replying to Martin Rubey:

It turns out that just new instead of solution_index is not enough, because I want to be able to guarantee the same order of solutions, see doctests of solution_iterator.

Then why not push the solution generator from Bijectionist down to _BijectistMILP?

mantepse commented 1 year ago

comment:107

Replying to Matthias Köppe:

Replying to Martin Rubey:

It turns out that just new instead of solution_index is not enough, because I want to be able to guarantee the same order of solutions, see doctests of solution_iterator.

Then why not push the solution generator from Bijectionist down to _BijectistMILP?

Why would this get rid of solution_index?

Apart from that, I repeat that it is not at all clear to me whether there should be one class or two classes, or, more generally, which methods should be in which class.

The general idea so far was, that the Bijectionist.set-methods only normalize the data, the _BijectionistMILP class sets up the constraints, and the other methods in Bijectionist retrieve solutions (of various kinds) from the _BijectionistMILP class.

It seems desirable that after retrieving certain kinds of solutions (e.g., constant_blocks(optimal=True), the MILP should be rewritten. However, I don't know in which cases this makes sense. For example, currently this is not done for possible_values(optimal=True), because I was unable to find a situation where this makes sense. I also suspect that this might also depend a lot on the solver and how we use it.

In any case, maybe we should be more radical and move even more stuff into _BijectionistMILP - i.e., make the Bijectionist mainly a wrapper for methods from the _BijectionistMILP class?

One further remark: the method Bijectionist.statistics_fibers is probably one of the most useful methods, and does not involve any MILP at all. It is important that this is lightweight.

mkoeppe commented 1 year ago

comment:108

Replying to Martin Rubey:

Replying to Matthias Köppe:

Replying to Martin Rubey:

It turns out that just new instead of solution_index is not enough, because I want to be able to guarantee the same order of solutions, see doctests of solution_iterator.

Then why not push the solution generator from Bijectionist down to _BijectistMILP?

Why would this get rid of solution_index?

Asking the model to provide an iterator for solutions is a better interface than asking it to provide the n-th solution if iterating through the solutions is all you need.

mkoeppe commented 1 year ago

comment:109

See also #34888 for a general design of a solution enumerator given a MixedIntegerLinearProgram.

mantepse commented 1 year ago

comment:110

Replying to Matthias Köppe:

Then why not push the solution generator from Bijectionist down to _BijectistMILP?

Why would this get rid of solution_index?

Asking the model to provide an iterator for solutions is a better interface than asking it to provide the n-th solution if iterating through the solutions is all you need.

Althout I do not understand how this answers my question, I assume it might be better to push down the implementations of solutions_iterator, minimal_subdistributions_iterator, minimal_subdistribution_blocks_iterator, possible_values and constant_blocks.

Do you agree, and would this be a step towards a positive review?

mkoeppe commented 1 year ago

comment:111

I think it is better to think about _BijectionistMILP as a class that manages one family of MIP models - the one set up by its __init__ method and possibly additional ("temporary") constraints. The class translates between the natural presentation of solutions and the technical presentation (encoding) that is used by the MIP model.

The method minimal_subdistributions_iterator uses a different, unrelated MIP model. It should not be managed by the same class. If you want, you can create a separate class for it.

Likewise for minimal_subdistributions_blocks_iterator.

mantepse commented 1 year ago

comment:112

So, do I understand correctly that I should only move the implementation of solutions_iterator?

mkoeppe commented 1 year ago

comment:113

Yes

sagemath / sage

a bijectionist's toolkit #33238