MiniZinc / libminizinc

The MiniZinc compiler
http://www.minizinc.org
Other
510 stars 80 forks source link

enhancement request: need API access to generate integer sequences for number of combinatorial objects #608

Closed dan1221 closed 2 years ago

dan1221 commented 2 years ago

Sequences at oeis.org often show a Mathematica implementation, which often begins with a "Table" function, such as: Table[(n^2, {n, 1, 9}] https://www.wolframalpha.com/input?i=Table%28n%5E2%29%2C+%7Bn%2C+1%2C+9%7D%5D

However, sequences at oeis.org that show MiniZinc implementations, such as: https://oeis.org/A000170/a000170.mzn.txt often resort to less elegant command-line incantations with pipes and filters, such as:

% Usage:
% minizinc -a --soln-sep "" --search-complete-msg "" -D"n=8;" queen.mzn 2>/dev/null | awk 'NF' | sort

This is related to issue #607 "Counting solutions".

Dekker1 commented 2 years ago

I'm not sure how the Table wolfram function relates to getting all solutions from a MiniZinc model. Table in Wolfram is equivalent to an array comprehension in MiniZinc. (Table[(n^2), {n, 1, 9}] is equivalent to [pow(n,2) | n in 1..9])

I'm not very sure what the part is that you are missing. Trying the command the piping actually only sorts the solutions. This could also have been done with the --canonicalize flag. The sorting might be why the separators were removed, so minizinc -a --canonicalize -D"n=8;" queen.mzn should give you the exact same information.

Could you elaborate on the functionality that you are missing?

dan1221 commented 2 years ago

I'm missing a convenient way to use MiniZinc to compute and output integer sequences such as at oeis.org, where the sequence is the number of combinatorial objects of some particular type for sizes 1..10. Should be able to declare a type of combinatorial object, such as an involution permutation, and then for n in 1..10 print the number of instances that satisfy the constraints.

% here's the idea, don't worry about the syntax

for n in 1..10 loop
   type a is array[1..n] of 1..n;
   type permutation is a | alldifferent;
   type involution(x) is permutation | x = inverse(x);
   output involution.number_of_solutions; 
end loop;
Dekker1 commented 2 years ago

At the moment this would be something for which you would use MiniZinc Python. You could then do something like:

from minizinc import Instance, Model, Solver

chuffed = Solver.lookup("chuffed")

permutation = Model()
permutation.add_string(
    """
    int n;
    array[1..n] of 1..n: x;
    constraint all_different(x);
    """
)
instance = Instance(chuffed, permutation)

count = 0
for i in range(1, 11):
    count += len(instance.solve(all_solutions=True))

print(f"num permutations: {count}")
zayenz commented 2 years ago

As @Dekker1 says, counting solutions and running for different inputs is essentially a meta-level property. MiniZinc in itself is not actually a programming language although it looks like one. What it does is it specifies an existential problem to solve, not it's counting variant. It can then be used as a "subroutine" to find these features, and the two typical ways to do that is via scripting and running the executable or using the Python interface.

Apart form that, I wanted to chime in on what solution counting even means. Consider the model

int: n = 10;
set of int: Domain = 1..n;

var Domain: x;
var Domain: y;
var Domain: z;

constraint x < y /\ y < z;

solve satisfy;

which is a simple problem that has 120 solutions. Now, perhaps the y variable is just used as a witness by the modeller to indicate that there is a value in-between x and z (in real models, it is not uncommon to have variables where solutions re equivalent for different values). Adding the output-item

output [show([x, z])];

changes the solution count to 36, as only the solutions which are unique for the output-variables are counted.

This is not the whole picture though. Adding a search annotation into the mixture

solve :: int_search([x, y, z], input_order, indomain_min) satisfy;

while keeping the above output item will make y relevant again for some solvers. Gecode will produce 120 solutions (and indicate so in the statistics), but minizinc will only print the unique solutions. Chuffed on the other hand will produce only the 36 solutions that are unique for x and z.

dan1221 commented 2 years ago

Thanks for the pointer to MiniZinc Python. It looks like it solves the issue of being able to print a sequence of solution counts based on different parameters.

But I still claim that MiniZinc itself should be able to do this. Zayenz wrote

"counting solutions and running for different inputs is essentially a meta-level property. MiniZinc in itself is not actually a programming language although it looks like one."

What MiniZinc is or isn't seems like a debatable philosophical problem. I do recognize the notion of separation of concerns, where if you need scripting you can defer that to Python rather than bloating MiniZinc with builtin scripting operations. But the issue seems a lot simpler than that. MiniZinc has an API, and Python can access it, but MiniZinc can't access its own API! If a MiniZinc model could access the API, it could get hold of the number of solutions, just as Python can do, and presumably it could get hold of the number of solutions multiple times, based on different values of a parameter, just as Python can do. And it already knows how to output a sequence of values stored in an array. So I don't envision adding any significant machinery other than the ability to access the API to get solution counts, and to restart the model with different parameter values. I contend this is well within the spirit of what a constraint solver should do, and certainly matches a potentially high visibility use case of making MiniZinc suitable for publishing models that generate combinatorial sequences such as those at oeis.org.

zayenz commented 2 years ago

I would argue that it is important to keep the separation of the modelling layer as in MiniZinc and the meta-layer of the solving process separated. I think that MiniZinc is hard enough to develop as it is.

As a simple counter-argument for including solution counts, there are local search back-ends and other incomplete solvers that may not be able to guarantee any solution, let alone all. Managing differences between solver capabilities is tricky already.

Having a small python stub that can be re-used for taking a model file and a value for n that spits out the model count using a complete solver that can enumerate all solutions is I think a more elegant solution, as the semantics of mixing an existential model as in MiniZinc with meta-level properties of the solving process makes my head hurt 😀

dan1221 commented 2 years ago

It seems to me that whatever difficulties there are managing differences between solvers would be no different than what Python encounters when using the MultiZinc API.

I'm not in a good position to judge the technical difficulties of implementing this, but I do think it has benefits. It would be perfect for several combinatorial examples in the Handbook, such as the n-queens problem and latin squares.

I understand the elegance of separation of concerns, but there is a tradeoff vs elegance of the model, where using a single language is more elegant than a hybrid mixture of two. I suppose one could use a shell script or perl script to generate one data file for each possible parameter, and then execute the model once for each data file, and then extract the solution counts from the statistical output and print the results. That is elegant in terms of separation of concerns, but the code isn't something you'd want to publish to show off the power of your modeling language. Similarly, including MiniZinc code in a Python program gets the job done, but is overkill just to output a sequence of solution counts.

I suspect that having access to solution counts from within a MiniZinc model would solve some of the problems that cause people to use MiniZinc with Python, besides just wanting to output a sequence of solution counts. For example, it may come in handy when trying to find the average number of solutions over a random assignment of variable values.

Dekker1 commented 2 years ago

What you seem to be looking for seems to be very close to Constraint Logic Programming. In CLP systems you would be able to do exactly what you want to do. However, it was found that these systems often required such tight integration with a solver to be effective, and, since you might need very different types of solvers for different types of problems, these systems where hard to develop and maintain.

MiniZinc was designed to be a more minimal approach. A model describes the problem independent of the intended solver, an instance of the model is rewritten once into an equisatisfiable solver-level specification, and the solver then produces solutions (which are translated back to the original model level by MiniZinc). This process has proven to be very successful, but as you noticed it is then sometimes restricted by the language of the solvers constraints and functionality of the solver.

In the last few years, we've been researching gaining back some of the functionality that CLP had, but without giving up a lot of the benefits that the current architecture brings. Our aim has mostly been to work around some of the limitations of different solvers and to bring more advanced search methods to the language, but it seems like the functionality that you are talking about is very similar.

This process started with MiniSearch, a search language extension to MiniZinc. This did mostly what you are talking about, adding a type of scripting layer to the MiniZinc language, allowing users to specify a process of "dealing" with the model. In my opinion MiniSearch had two big downsides, it required tight integration with the solver (like CLP), and the search language was often hard to comprehend (often requiring hours of debugging for me).

The process has since led us down two different roads. We developed MiniZinc Python as scripting side to MiniZinc. To me it feels a lot easier to express the intended process of a model in a imperative language, and it has been designed to be compatible with a newly designed incremental interface to MiniZinc.

We also looked at a simpler way to express meta-search heuristics, that were the goal in MiniSearch. We noticed that most common heuristics were (or could be) expressed as restart-based search, and designed a MiniZinc language extension based on this that only required minimal integration with solver, but still had good solver performance. Different from all other methods, it still only requires the model to be rewritten to the solver-level once. This method is described in a paper called "Solver-independent Large Neighbourhood Search", and further expanded in my thesis ("A Modern Architecture for Constraint Modelling Languages"). We still hope to release this when we have an alternative (incremental) interface for the solvers that do not support the extension. I'm afraid, however, that the this approach would not cover your examples.

All-in-all, these incremental methods are some of the primary things we are working on, but the API that you are looking for will likely not be available in the language itself. Instead we hope that MiniZinc Python offers a good alternative.