disco-lang / disco

Functional teaching language for use in a discrete mathematics course
Other
161 stars 22 forks source link

Generalized combinatorial operators #126

Open byorgey opened 6 years ago

byorgey commented 6 years ago

Right now we have a choose operator which works on natural numbers, e.g.

Disco> 5 choose 2
10

However, what this is really doing is counting the elements of a certain set, i.e. the number of size-2 subsets of a set of size 5. So there is a natural generalization of this operator to act on sets, e.g.

Disco> {2,5,6,8,9} choose 2
{{2,5}, {2,6}, {2,8}, {2,9}, {5,6}, {5,8}, {5,9}, {6,8}, {6,9}, {8,9}}

It would also be natural for it to act on multisets and lists as well (in which case it should return a multiset rather than a set), for example:

Disco> [1,1,2] choose 2
{# [1,1] # 1, [1,2] # 2 #}

Going down a similar path, the factorial operator can be generalized to return all permutations of a container, e.g.

Disco> [1,2,3]!
{[1,2,3], [1,3,2], [2,1,3], [2,3,1], [3,1,2], [3,2,1]}
byorgey commented 2 years ago

As a first step, we should for sure add the ability for choose to act on a set. I am not as sure about the other generalizations.

Note that choose has already been generalized to multinomial coefficients, e.g. 10 choose [2,2,3,3]. Applied to sets, I suppose this should return a set of partitions, e.g.

Disco> {2,5,6,8,9} choose [2,2]
{{{2,5}, {6,8}}, {{2,5}, {6,9}}, {{2,5}, {8,9}}, {{2,6}, {5,8}}, ...}
byorgey commented 2 years ago

https://github.com/disco-lang/disco/blob/8cdb3c914805c400cd5aae2f32302e9fe6882335/src/Disco/Typecheck.hs#L815-L821

One difficulty is that the output type would now depend on the input type(s). I suppose we would have to simply generate a single type variable for the entire thing and then generate a COr constraint saying that it is either equal to N * N -> N, or N * List(N) -> N, or Set(a) * N -> Set(Set(a)) (and also possibly Set(a) * List(N) -> Set(Set(Set(a)))). We would have to play around to see how annoying that is in practice. It will probably be a lot more reasonable once we merge #388.

This also makes me realize that the second argument for a multinomial coefficient should really be a bag, not a list! The order of elements does not matter. https://github.com/disco-lang/disco/issues/352