maxale opened this issue 3 weeks ago
If I'm not mistaken, Partitions(n, max_length=m).cardinality() and Partitions(n, min_length=m).cardinality() actually both compute the number by generating the partitions.
There is no dedicated class for partitions of given maximal length. The reason is, very likely, that there are too many combinations of possible parameters, and nobody has cared enough about this one yet. I do think that we should have it, because of $GL_n$.
We do have a dedicated cardinality function for partitions of given length, though.
sage: n=50; m=10
sage: %time sum(Partitions(n, length=i).cardinality() for i in range(m+1))
CPU times: user 114 ms, sys: 88.8 ms, total: 202 ms
Wall time: 221 ms
62740
sage: %time Partitions(n, max_length=m).cardinality()
CPU times: user 3.32 s, sys: 0 ns, total: 3.32 s
Wall time: 3.32 s
62740
PS: I appreciate your reports a lot!
It's unfortunate that .cardinality() in these cases is computed via an actual generation of the partitions. I think there should be an efficient implementation for the "base" cases with max_length= or max_part= or both specified. Then the cardinality for length=k can be computed as the difference of those for max_length=k and max_length=k-1, the cardinality for min_length=k can be computed via complementation, and the same goes for min_part=k. This will cover most useful cases, in my view.
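For example, here is a minimal check of these identities in a Sage session (small n, so that even the brute-force counts are quick):

sage: n, k = 20, 5
sage: Partitions(n, length=k).cardinality() == (Partitions(n, max_length=k).cardinality()
....:     - Partitions(n, max_length=k-1).cardinality())
True
sage: Partitions(n, min_length=k).cardinality() == (Partitions(n).cardinality()
....:     - Partitions(n, max_length=k-1).cardinality())
True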
On a related note, I think Sage should issue a warning (at verbosity level 0 or 1) whenever it knowingly uses an inefficient algorithm (such as counting via explicit generation). I used to believe that whenever a certain functionality exists, it implements the best available algorithm. Knowing when this is not the case would greatly help in finding bottlenecks and avoiding slow algorithms when performance is an issue.
Any combination of any subset of length, min_length, max_length, min_part should now be reasonably fast. I stuck to the current scheme (i.e., relating all computations to those for given size and length), which looked easier.
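For instance, a brute-force cross-check over a few such combinations could look as follows (a sketch with arbitrary small parameters, so that listing all partitions is cheap):

sage: n = 15
sage: for opts in [{'length': 4}, {'length': 4, 'min_part': 2},
....:              {'min_length': 3, 'max_length': 6},
....:              {'max_length': 5, 'min_part': 2}]:
....:     parts = Partitions(n, **opts)
....:     assert parts.cardinality() == len(parts.list())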
As for the suggestion that Sage should warn whenever it knowingly uses an inefficient algorithm: I think such verbosity is hard to maintain. However, I use %prun and %lprun a lot; do you know about these? In the case at hand:
sage: %prun -s cumulative Partitions(30, min_length=10).cardinality()
18655 function calls (18645 primitive calls) in 1.355 seconds
Ordered by: cumulative time
ncalls tottime percall cumtime percall filename:lineno(function)
2/1 0.001 0.001 1.354 1.354 {built-in method builtins.exec}
1 0.000 0.000 1.349 1.349 <string>:1(<module>)
1 0.003 0.003 1.349 1.349 finite_enumerated_sets.py:97(_cardinality_from_iterator)
2545 1.321 0.001 1.346 0.001 lists.py:225(_element_iter)
2544 0.006 0.000 0.025 0.000 lists.py:281(_element_constructor_default)
2544 0.005 0.000 0.018 0.000 partition.py:518(__init__)
2544 0.008 0.000 0.012 0.000 combinat.py:1526(__init__)
2544 0.004 0.000 0.004 0.000 combinat.py:1111(__init__)
1 0.000 0.000 0.003 0.003 partition.py:6114(__classcall_private__)
...
@mantepse: thank you for the improvements! A nice test case is the Sage code that I gave at https://oeis.org/A333148 - how fast can it compute terms for n around, say, 1000?
Apparently, I missed a case, because Partitions(m, max_part=l, length=k).cardinality() is still iterating over all partitions.
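As an aside, this case has a classical cross-check: the number of partitions of m with at most k parts, each at most l, is the coefficient of q^m in the Gaussian binomial $\binom{k+l}{k}_q$, so the fixed-length count is a difference of two such coefficients. A small sanity check (offered as a check only, not as a claim about how Sage should implement this):

sage: from sage.combinat.q_analogues import q_binomial
sage: m, l, k = 12, 5, 3
sage: def in_box(j):
....:     # partitions of m with at most j parts, each part at most l
....:     return q_binomial(j + l, j)[m]
sage: in_box(k) - in_box(k - 1) == Partitions(m, max_part=l, length=k).cardinality()
True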
@maxale, I have now coded that, too. It is not completely satisfying, because if you call Partitions(args).cardinality(), an awful amount of time is spent in initialising the iterator facility.
With some effort, one can bypass the problem as follows:
from sage.combinat.partition import (number_of_partitions_length,
                                     number_of_partitions_length_max_part)

def P(n, a, b, k, m):
    """
    Return the number of partitions of n with parts between a and b
    and length between k and m.

    sage: def D(n):
    ....:     s1 = number_of_partitions(n)
    ....:     s2 = sum(P(m, 0, l, k, k)
    ....:              * P(n-m-l^2, 0, Infinity, k+2*l, Infinity)
    ....:              for l in range(1, (n+1).isqrt())
    ....:              for m in range((n-l^2-2*l)*l//(l+1)+1)
    ....:              for k in range(ceil(m/l), min(m, n-m-l^2-2*l)+1))
    ....:     return s1 + s2
    """
    n = ZZ(n)
    # normalise the bounds
    if a <= 0: a = 1
    if b > n: b = n
    if k < 0: k = 0
    if m > n: m = n
    if not k and m == n and a == 1:
        # unrestricted length and min_part 1: by conjugation, partitions
        # with parts at most b are exactly partitions with at most b parts
        return ZZ.sum(number_of_partitions_length(n, i) for i in range(b + 1))
    # otherwise, subtract a-1 from each of the l parts and sum over lengths;
    # number_of_partitions_length_max_part is added by the pull request
    return ZZ.sum(number_of_partitions_length_max_part(n - (a - 1)*l, l, b - a + 1)
                  for l in range(k, m + 1))
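As a quick sanity check of P (assuming that number_of_partitions_length_max_part, which comes from the pull request, counts partitions of its first argument with the given length and maximal part, as used above):

sage: P(10, 1, 10, 0, 10) == number_of_partitions(10)   # no effective restriction
True
sage: P(18, 2, 6, 3, 5) == Partitions(18, min_part=2, max_part=6,
....:                                 min_length=3, max_length=5).cardinality()
True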
Doing so, D(300) takes 10 seconds on my computer, and D(350) takes 15 seconds.
@mantepse: Thanks for checking. My hope is to have an efficient and more or less universal .cardinality() method that hides all the technicalities. As for the slowness of the iterator construction - can it be done only when an actual iteration starts? Always creating an iterator is a bad design decision, to my taste.
> @mantepse: Thanks for checking. My hope is to have an efficient and more or less universal .cardinality() method that hides all the technicalities.
Yes, that's exactly what the class Partitions (should) provide. It is not all that easy, though: one may want to choose different algorithms for different sizes of parameters, and there may be a space/time tradeoff.
> As for the slowness of the iterator construction - can it be done only when an actual iteration starts? Always creating an iterator is a bad design decision, to my taste.
You are quite right. I guess the solution might be that Partitions should not inherit from IntegerListsLex. I took that for granted, but it is actually not necessary.
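To illustrate the idea, here is a hypothetical sketch of such a restructuring (all names are invented for this sketch; the actual pull request may well do it differently). The point is that the expensive IntegerListsLex backend is only built when iteration is actually requested, so that cardinality() can dispatch to counting formulas without paying for it:

from sage.misc.lazy_attribute import lazy_attribute
from sage.combinat.integer_lists import IntegerListsLex

class LazyPartitions:
    # hypothetical sketch, not the actual Sage class
    def __init__(self, n, **constraints):
        self.n = n
        self.constraints = constraints

    @lazy_attribute
    def _backend(self):
        # built on first access, not in __init__
        opts = dict(min_part=1)            # partitions have positive parts
        opts.update(self.constraints)
        return IntegerListsLex(self.n, max_slope=0, **opts)

    def __iter__(self):
        return iter(self._backend)

    def cardinality(self):
        # a counting formula would go here when one applies; only this
        # fallback ever touches the lazily constructed backend
        return sum(1 for _ in self)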
@mantepse: Could you please submit a ticket on restructuring the Partitions class to make it lazy with respect to the iterator machinery?
I have done this already on the pull request.
@mantepse: I'm confused, since earlier you said:
> It is not completely satisfying, because if you call Partitions(args).cardinality(), an awful amount of time is spent in initialising the iterator facility.
Do you mean that this issue was addressed in a new pull request, or what is the current situation?
It is in the last commit: f8497e93f212f5c4514fced3ffd3b038a0a7ebb3
Problem Description
Check out the timing comparison in the session at the top of this issue.
Proposed Solution
As the code example shows, computing the cardinality with min_length= appears to be much slower than with max_length=. Hence, it would be beneficial to convert one into the other by complementation (in the set of all partitions), as illustrated by the example.
Alternatives Considered
I believe the cardinalities in both cases can be computed via dynamic programming, but I did not check whether Sage uses it, nor why one case is so much slower than the other.
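For what it's worth, a dynamic-programming count of the basic quantity takes only a few lines; here is a minimal sketch (an illustration, not Sage's implementation), using the standard recurrence for partitions with parts at most k, which by conjugation also counts partitions with at most k parts:

from functools import cache

@cache
def npart_max_part(n, k):
    # number of partitions of n into parts of size at most k; by
    # conjugation, also the number of partitions with at most k parts
    if n == 0:
        return 1
    if n < 0 or k == 0:
        return 0
    # either no part equals k, or remove one part equal to k
    return npart_max_part(n, k - 1) + npart_max_part(n - k, k)

For example, npart_max_part(50, 10) returns 62740, in agreement with the session at the top.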