Closed maisonobe closed 2 years ago
I think using the rosen iterator as a basis should be faster:
Often we don't need the complete partition as a full list for example in "pattern matching". A "step visitor pattern" can decide, if we stop at a "matching iteration":
You are right, this is exactly what I needed!
I was going to refactor the algorithm to use a Stream API which would have allowed to filter the partitions upon generation.
Would you allow me to copy this code and put it in Hipparchus by changing its license from GPLV3 to Apache 2?
Would you allow me to copy this code and put it in Hipparchus by changing its license from GPLV3 to Apache 2?
Yes, no problem
Upon testing, it doesn't seem to be equivalent to my needs, so I will keep both implementations.
The Rosen algorithm gives the count, so for example RosenNumberPartitionIterator(4, 2)
generates the three arrays:
[1, 3]
[2, 2]
[3, 1]
However, there are several ways to distribute the elements within these arrays. What I needed was an algorithm that would produce the 7 partitions, not just count their number of elements:
{ 1, 2, 3 }, { 4 }
{ 1, 2, 4 }, { 3 }
{ 1, 2 }, { 3, 4 }
{ 1, 3, 4 }, { 2 }
{ 1, 3 }, { 2, 4 }
{ 1, 4 }, { 2, 3 }
{ 1 }, { 2, 3, 4 }
I have set up the two implementations as they provide two different and complementary results.
The implementation based on B. Djokić algorithm has been changed to a Stream API. It allows filtering the partitions during generation, typically to select only the ones with a specific number of parts, which is the main use case. This change also avoids huge memory consumption as the number of partitions grows up fast (there are more than 10 billions of partitions of a 16 elements set). On my computer, I was not able to go past 12 elements with the version that stored everything, but with the Stream version I could generate the 10 billions partitions for 16 elements (it took about 20 minutes on a desktop computer with an AMD Ryzen 5 3600 processor and 16GB RAM running Linux).
I have also added the computation of Bell numbers, as I used them in the validation tests.
If I'm using "multiple equal objects", i.e.
Arrays.asList(1, 2, 2, 4, 4)).collect(Collectors.toList()
I get "duplicate" entries:
{1,2,2,4,4}
{1,2,2,4}{4}
{1,2,2,4}{4}
{1,2,2}{4,4}
...
Is it possible to restrict the partitions iterator to the same number of partitions?
I don't think it is possible without storing all partitions and filtering duplicates afterwards. The reason is twofold:
One way to do it is post-processing the entries and filtering out.
Just tested with MMA.
The algorithm also seems to take the same combinations again for pattern-matching, if the elements are equal.
SetAttributes[h, {Flat,Orderless}]
ReplaceList[h[a,b,b,d],h[x_, y_,z_ ] :> {x, y ,z}]
But for symbols like Plus
and Times
(which are the typical examples for Flat, Orderless
functions) this is no problem, because argument evaluation of b+b
gives 2*b
and b*b
gives b^2
as "new element" in the evaluated list.
So I think it's OK to try the same combination again in the iterator for this case.
As part of implementing #191, I came across a need to generate some partitions of a set of derivatives, like for example (∂²pᵢ/∂qₗ∂qₘ, ∂pj/∂qₙ), (∂²pᵢ/∂qₘ∂qₙ, ∂pj/∂qₗ), (∂²pᵢ/∂qₘ∂qₗ, ∂pj/∂qₘ) which correspond to all ways to partition the set of three variables qₗ, qₘ, qₙ into two sets.
A generic way to generate arbitrary partitions from a list of objects would be nice.