a bijectionist's toolkit

mantepse commented 2 years ago

We provide a toolkit that helps combinatorialists find functions ("statistics") s: A -> Z and bijections A -> B satisfying various constraints, given sequences of finite sets A and B.

Depends on #34881
Depends on #34878

CC: @stumpc5 @tscrim @fchapoton

Component: combinatorics

Author: Alexander Grosz, Tobias Kietreiber, Stephan Pfannerer, Martin Rubey

Branch/Commit: u/mantepse/a_bijectionist_s_toolkit @ 53506aa

Reviewer: Matthias Koeppe, ...

Issue created by migration from https://trac.sagemath.org/ticket/33238

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 1 year ago

Changed commit from 21cb54f to 6868261

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 1 year ago

Branch pushed to git repo; I updated commit sha1. New commits:

a573bb4 remove unnecessary calls to list in doctests
6868261 move iterator over all solutions to _BijectionistMILP

mkoeppe commented 1 year ago

comment:115

Now try to merge _solve, solution, and __iter__ into a single method and get rid of last_solution and index/solution_index

mantepse commented 1 year ago

comment:116

Do you know how to do this? It is quite stressful for me to deal with this, although it is certainly important to me to get this ticket into sage, and I must say that it looks like an implementation detail to me. In fact, it smells like premature optimization. Anyway.

We have a cache of solutions.

The iterator is supposed to yield from this cache, and only if it is at the end of the cache, generate new solutions. However, other methods may also request that solutions - with special properties - are added to the cache.
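A minimal sketch of the caching behaviour described above, in plain Python and independent of Sage. The class name `CachedSolutions` and its methods are hypothetical and only illustrate the idea; the code on the branch is organised differently.

```python
class CachedSolutions:
    """Yield solutions from a cache, extending it lazily (hypothetical sketch)."""

    def __init__(self, generate):
        self._source = generate()   # underlying, possibly expensive, generator
        self._cache = []            # solutions found so far

    def add(self, solution):
        """Let another method record a solution it found on its own."""
        if solution not in self._cache:
            self._cache.append(solution)

    def __iter__(self):
        i = 0
        while True:
            # First replay what is cached, including entries added meanwhile.
            while i < len(self._cache):
                yield self._cache[i]
                i += 1
            # Only at the end of the cache ask the source for something new.
            for solution in self._source:
                if solution not in self._cache:
                    self._cache.append(solution)
                    break
            else:
                return  # source exhausted and cache fully replayed
```

Every iterator started on the same object replays the shared cache first, so a solution added through `add` is seen by later iterations without being recomputed.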

mkoeppe commented 1 year ago

comment:117

I'd say the cache is a workaround (premature optimization) for a very inefficient method to generate all feasible solutions of a MIP model.

mantepse commented 1 year ago

comment:118

Ah, no, I don't think so. Some of these problems have very few solutions, and I want to look at all of them. However, frequently we will have a huge number of solutions (just try the "benchmark" problem with N=6 or N=7), of which only a few are interesting.

As I mentioned before, generating all solutions is not really what the tool is about. The most interesting method is probably minimal_subdistributions_iterator (whereas the method which is likely to be used most frequently is just statistics_fibers).

The cache allows the user to switch between these methods. For example, I might first want to check that a solution exists (this was the main motivation, using the "benchmark" example, which is now https://arxiv.org/abs/2004.01140). However, it may then be interesting to see whether we can compute at least a few minimal subdistributions, etc.

This is the reason why I am also so disappointed that adding and removing constraints in SCIP forgets all of the information accumulated so far.

mkoeppe commented 1 year ago

comment:119

Basically my recommendation is to check whether for any of the MIPs for which you need more than 1 solution, the iterated MIP solving with "veto constraints" is faster than polyhedral vertex enumeration & filtering.
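For reference, a "veto constraint" for a 0/1 solution is usually the standard no-good cut. The form below is the textbook one, stated as an assumption; it is not necessarily the exact constraint used on this branch. Given a binary solution $x^*$, it excludes $x^*$ and no other 0/1 point:

```latex
\sum_{i \,:\, x^*_i = 1} x_i \;-\; \sum_{i \,:\, x^*_i = 0} x_i
  \;\le\; \bigl|\{\, i : x^*_i = 1 \,\}\bigr| - 1
```

At $x = x^*$ the left-hand side equals the number of ones in $x^*$, which violates the bound. Flipping any coordinate lowers the left-hand side by at least one, so every other binary point remains feasible.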

mkoeppe commented 1 year ago

comment:120

Maintaining a solution candidate pool when you solve many related problems certainly has merit. But using an index to refer to an initial segment of the cache is a very awkward interface. Hence my suggestion to just have a method like the following:

def solutions_iterator(self, on_blocks, additional_constraints): ...

By writing it as a generator, the control and data flow can be expressed naturally.

If you only need 1 solution, just call it as next(bmilp.solutions_iterator(...)).
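A rough sketch of such a generator interface, assuming a toy model where brute-force enumeration stands in for the MIP solver. The free function and its parameters `n` and `constraints` are hypothetical; only the name `solutions_iterator` and the idea of passing `additional_constraints` come from the suggestion above.

```python
from itertools import product


def solutions_iterator(n, constraints, additional_constraints=()):
    """Yield every 0/1 vector of length n satisfying all constraints.

    Brute force stands in for the MIP solver here; in a solver-based
    version each yielded solution would be excluded by a veto constraint
    before solving again.
    """
    all_constraints = list(constraints) + list(additional_constraints)
    for x in product((0, 1), repeat=n):
        if all(c(x) for c in all_constraints):
            yield x


# Toy model: exactly two of three binary variables are set.
model = [lambda x: sum(x) == 2]

# One solution only: the generator stops after the first feasible point.
first = next(solutions_iterator(3, model))

# All solutions of a related problem with one extra constraint.
with_extra = list(solutions_iterator(3, model, [lambda x: x[0] == 1]))
```

Because the generator is lazy, asking for a single solution does no more work than finding that solution, and no index into a cache is needed.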

mkoeppe commented 1 year ago

comment:121

Replying to Martin Rubey:

The most interesting methods is probably minimal_subdistributions_iterator

Do you have a computationally nontrivial example of using this one?

mantepse commented 1 year ago

comment:122

Replying to Matthias Köppe:

Replying to Martin Rubey:

The most interesting methods is probably minimal_subdistributions_iterator

Do you have a computationally nontrivial example of using this one?

It is not so interesting anymore, since the problem has been solved, but you could look at the "benchmark example" for larger N, perhaps 7 or 8 (with minimal_subdistributions_blocks_iterator). For N=6 the output is just 18 lines, and takes about 40 seconds on my laptop. In this case, just 104 solutions of the original problem are generated (and almost all the computation goes into the milp set up in line 2075).

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 1 year ago

Branch pushed to git repo; I updated commit sha1. New commits:

61e97bc merge _solve, solution and __iter__

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 1 year ago

Changed commit from 6868261 to 61e97bc

mkoeppe commented 1 year ago

comment:124

Nice!

mkoeppe commented 1 year ago

comment:125

+                    # moving this out of the try...finally block breaks SCIP
+                    solution = self.milp.get_values(self._x,
+                                                    convert=bool, tolerance=0.1)

Does get_values throw an error?

mantepse commented 1 year ago

comment:126

Yes, Warning: method cannot be called before problem is solved.

(and I agree that it is better - it took me just 3 hours)

mkoeppe commented 1 year ago

comment:127

The finally block removes the transformed problem -- it's fair that it forgets the solution
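The ordering issue can be illustrated with a small stand-in, assuming a backend that discards its solution on cleanup. `ToySolver` and its methods are hypothetical and do not reproduce the SCIP or Sage API; only the pattern of reading the values before the `finally` cleanup mirrors the snippet above.

```python
class ToySolver:
    """Stand-in for a backend whose solution is lost on cleanup (hypothetical)."""

    def __init__(self, answers):
        self._answers = list(answers)
        self._current = None

    def solve(self):
        if not self._answers:
            raise RuntimeError("no feasible solution")
        self._current = self._answers.pop(0)

    def get_values(self):
        if self._current is None:
            raise RuntimeError("method cannot be called before problem is solved")
        return self._current

    def free_transform(self):
        # Mimics discarding the solved (transformed) problem.
        self._current = None


def solutions(solver):
    """Yield solutions until the solver reports infeasibility."""
    while True:
        try:
            solver.solve()
        except RuntimeError:
            return
        try:
            # Read the values while the solved problem still exists ...
            solution = solver.get_values()
        finally:
            # ... because the cleanup below forgets them.
            solver.free_transform()
        yield solution
```

Moving the `get_values` call after the `finally` block would raise in this toy model as well, since the cleanup has already run by then.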

mkoeppe commented 1 year ago

comment:128

Now that you have the clean interface solutions_iterator, you can write an alternative implementation using polyhedral vertex enumeration

mantepse commented 1 year ago

comment:129

Yes. Removing the other stuff, i.e.,

                    b = self.milp.get_backend()
                    if hasattr(b, "_get_model"):
                        m = b._get_model()
                        if m.getStatus() != 'unknown':
                            m.freeTransform()

gives me

        self.milp.remove_constraints(new_indices)
      File "sage/numerical/mip.pyx", line 2382, in sage.numerical.mip.MixedIntegerLinearProgram.remove_constraints
        self._backend.remove_constraints(constraints)
      File "sage/numerical/backends/generic_backend.pyx", line 432, in sage.numerical.backends.generic_backend.GenericBackend.remove_constraints
        self.remove_constraint(c)
      File "sage/numerical/backends/scip_backend.pyx", line 360, in sage.numerical.backends.scip_backend.SCIPBackend.remove_constraint
        raise ValueError("The constraint's index i must satisfy 0 <= i < number_of_constraints")
    ValueError: The constraint's index i must satisfy 0 <= i < number_of_constraints

mantepse commented 1 year ago

comment:130

Replying to Matthias Köppe:

Now that you have the clean interface solutions_iterator, you can write an alternative implementation using polyhedral vertex enumeration

I must not do this before somebody uses the tool, or I have completed my other duties. My fantasy was to finish this ticket before Christmas (last year :-)

But I am very grateful for your feedback and time!

mkoeppe commented 1 year ago

comment:131

Replying to Martin Rubey:

Yes. Removing the other stuff, i.e.,

                    b = self.milp.get_backend()
                    if hasattr(b, "_get_model"):
                        m = b._get_model()
                        if m.getStatus() != 'unknown':
                            m.freeTransform()

gives me

        self.milp.remove_constraints(new_indices)
      File "sage/numerical/mip.pyx", line 2382, in sage.numerical.mip.MixedIntegerLinearProgram.remove_constraints
        self._backend.remove_constraints(constraints)
      File "sage/numerical/backends/generic_backend.pyx", line 432, in sage.numerical.backends.generic_backend.GenericBackend.remove_constraints
        self.remove_constraint(c)
      File "sage/numerical/backends/scip_backend.pyx", line 360, in sage.numerical.backends.scip_backend.SCIPBackend.remove_constraint
        raise ValueError("The constraint's index i must satisfy 0 <= i < number_of_constraints")
    ValueError: The constraint's index i must satisfy 0 <= i < number_of_constraints

Not exactly sure what's happening there, but check whether it's still necessary after you merge the (just updated) #34890

mantepse commented 1 year ago

comment:132

No, this gives me several errors. For example:

File "src/sage/combinat/bijectionist.py", line 1149, in sage.combinat.bijectionist.Bijectionist.set_distributions
Failed example:
    bij.constant_blocks(optimal=True)
Exception raised:
    Traceback (most recent call last):
      File "/home/martin/sage-develop/src/sage/doctest/forker.py", line 695, in _run
        self.compile_and_execute(example, compiler, test.globs)
      File "/home/martin/sage-develop/src/sage/doctest/forker.py", line 1093, in compile_and_execute
        exec(compiled, globs)
      File "<doctest sage.combinat.bijectionist.Bijectionist.set_distributions[6]>", line 1, in <module>
        bij.constant_blocks(optimal=True)
      File "/home/martin/sage-develop/src/sage/combinat/bijectionist.py", line 647, in constant_blocks
        self._forced_constant_blocks()
      File "/home/martin/sage-develop/src/sage/combinat/bijectionist.py", line 1705, in _forced_constant_blocks
        updated_multiple_preimages[tZ + (solution[p],)].append(p)
    KeyError: [1, 3, 2]
**********************************************************************
File "src/sage/combinat/bijectionist.py", line 1151, in sage.combinat.bijectionist.Bijectionist.set_distributions
Failed example:
    sorted(bij.minimal_subdistributions_blocks_iterator(), key=lambda d: (len(d[0]), d[0]))
Exception raised:
    Traceback (most recent call last):
      File "/home/martin/sage-develop/src/sage/doctest/forker.py", line 695, in _run
        self.compile_and_execute(example, compiler, test.globs)
      File "/home/martin/sage-develop/src/sage/doctest/forker.py", line 1093, in compile_and_execute
        exec(compiled, globs)
      File "<doctest sage.combinat.bijectionist.Bijectionist.set_distributions[7]>", line 1, in <module>
        sorted(bij.minimal_subdistributions_blocks_iterator(), key=lambda d: (len(d[Integer(0)]), d[Integer(0)]))
      File "/home/martin/sage-develop/src/sage/combinat/bijectionist.py", line 2120, in minimal_subdistributions_blocks_iterator
        add_counter_example_constraint(s)
      File "/home/martin/sage-develop/src/sage/combinat/bijectionist.py", line 2093, in add_counter_example_constraint
        minimal_subdistribution.add_constraint(sum(D[p] for p in P
      File "/home/martin/sage-develop/src/sage/combinat/bijectionist.py", line 2094, in <genexpr>
        if s[p] == v) == V[v])
    KeyError: [1, 2, 3]

I have to stop now, it's 2:30am.

mkoeppe commented 1 year ago

Reviewer: Matthias Koeppe, ...

mkoeppe commented 1 year ago

comment:134

This is now OK from my side, but I've really only looked at the code from the viewpoint of managing the MIPs and getting solutions out of them. So it would be good if other reviewers could take a look at the combinatorial statistics side of things.

The code-style police will say that

+        evaluate = lambda f: sum(coeff if index == -1 else
+                                 coeff * values[index_block_value_dict[index]]
+                                 for index, coeff in f.dict().items())

should be a nested def instead.
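A sketch of what the nested def could look like, using a stand-in for the linear function so that it runs outside Sage. `LinearFunctionStub` and `evaluate_all` are hypothetical names; the body of `evaluate` is taken from the lambda above.

```python
class LinearFunctionStub:
    """Hypothetical stand-in for an object with a dict() method."""

    def __init__(self, coefficients):
        self._coefficients = dict(coefficients)

    def dict(self):
        return dict(self._coefficients)


def evaluate_all(functions, values, index_block_value_dict):
    def evaluate(f):
        # As in the lambda, index -1 is treated as the constant term;
        # every other index is looked up via index_block_value_dict.
        return sum(coeff if index == -1 else
                   coeff * values[index_block_value_dict[index]]
                   for index, coeff in f.dict().items())

    return [evaluate(f) for f in functions]
```

Besides satisfying pycodestyle (E731), a named inner function shows up under its own name in tracebacks and can carry a comment or docstring.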

Since the linter is currently broken, I'd recommend running ./sage -tox -- src/sage/combinat/bijectionist.py and fixing any reported issues.

mantepse commented 1 year ago

comment:135

Thank you! ./sage -tox -- src/sage/combinat/bijectionist.py does not emit any warnings on my machine.

On the combinatorics side, there is still the issue with the name of set_pseudo_inverse_relation, which is currently as silly as can be, but I don't know any better and did not yet receive an answer on https://mathoverflow.net/q/437261.

mkoeppe commented 1 year ago

comment:136

You can also try ./sage -tox -e pycodestyle -- src/sage/combinat/bijectionist.py

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 1 year ago

Branch pushed to git repo; I updated commit sha1. New commits:

db8947b pycodestyle stuff

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 1 year ago

Changed commit from 61e97bc to db8947b

mantepse commented 1 year ago

comment:138

I think it will be best to replace pseudo_inverse_relation with quadratic_relation, unless somebody comes up with a better name.

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 1 year ago

Branch pushed to git repo; I updated commit sha1. New commits:

9678717 Merge branch 'develop' of trac.sagemath.org:sage into t/33238/a_bijectionist_s_toolkit

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 1 year ago

Changed commit from db8947b to 9678717

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 1 year ago

Changed commit from 9678717 to 53506aa

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 1 year ago

Branch pushed to git repo; I updated commit sha1. New commits:

53506aa rename pseudo_inverse to quadratic
