MiniZinc / libminizinc

The MiniZinc compiler
http://www.minizinc.org
Other
510 stars 80 forks source link

Feature request: arrays of arrays with non-equal size #626

Open zayenz opened 1 year ago

zayenz commented 1 year ago

In some cases it is common to have a list of lists of non-equal length as the data. Representing this in MiniZinc can be awkward and require a lot of hacky code. When the sub-lists have no repetitions sets can be used, but when repetitions may occur

The currently simplest way is to pad all lists to contain an equal number of elements, and use a sentinel-value (for example absent <>), but this is messy and often takes some time to get right.

It would be great if it was possible to define a type

array[int] of array[int] of int: my_list_of_lists = [
  [1, 2, 3],
  [1, 2],
  [1, 1, 2, 2, 3, 3]
];
Dekker1 commented 1 year ago

There has been talk about supporting this feature directly in MiniZinc, especially since the upcoming MiniZinc release will have support for tuples and records. This means that the workaround could already change to:

array[_] of tuple(array[_] of int): my_list_of_lists = [
  ([1, 2, 3],),
  ([1, 2],),
  ([1, 1, 2, 2, 3, 3],),
];

Which (for example) could then be accessed as my_list_of_lists[3].1[6] to get the last element. This means that the functionality will be all there, but you are currently stuck with the extra .1 (tuple accessor) and the tuple in the type, and internally the tuples also add an extra layer in the representation.

Direct support for this feature is slightly tricky since we currently cannot represent this type in our compact Type objects (which are not easy to change). Then would can choose whether we want to directly support this feature in the implementation (and save on indirection in data representation), or whether we rewrite them into tuples for consistency.

zayenz commented 1 year ago

The tuple work-around, while not pretty, will solve the immediate problem. Thanks for that idea.

Mommessc commented 1 year ago

I've come across this issue as well, is there a workaround to have an array of var int inside the tuple? (but the size of each inner array would be fixed) When I tried array[_] of tuple(array[_] of var int) I got the error "MiniZinc: evaluation error: not a set of int expression". I think it comes from the inner array that has undefined index range, which is not possible.

Dekker1 commented 1 year ago

@Mommessc The idea is that you would be able to create a model similar to:

array[_] of int: len = [1,5,3,4];
array[_] of tuple(array[_] of var int): x ::output = [ ([let {var int: i} in i | _ in 1..l],) | l in len];

I did however discover a bug in the output processing when testing this model, which will appear on the develop branch shortly.

Note that your declaration without a right hand side would not be valid as the lengths of the arrays should still be known at compile time.

Mommessc commented 1 year ago

This solution seems promising, I'll wait for the develop branch then.

Dekker1 commented 1 year ago

The fix was merged in https://github.com/MiniZinc/libminizinc/commit/c556c47a1d40b2b80666f9a550409c1bf1894e50. Although I notice that I forgot to add my test case, which CI should merge in in a few minutes.

Mommessc commented 1 year ago

Thanks for the update, it works as expected.

For the record, if someone passes by this issue with similar use case, I did a few tests to evaluate the time needed to solve a small problem with the 3 modelling options: 1) Using the opt values <> for padding the smaller arrays 2) Using a default value (-1 in my case as I deal only with positive integers) for padding the smaller arrays 3) Using the trick with the tuple of array

What I observed is that solutions 2) and 3) took similar time to solve, while solution 1) took longer time to solve. However I was limited in my tests to arrays of very small size, like 7x5. I expect with larger and sparse arrays with a lot of padding involved the solution 3) would be quicker than solution 2).