Closed darijgr closed 10 years ago
I think the bug is in unrank, not in the fact that rings use __getitem__
in a funny way. The problem is that a Cartesian product can easily be constructed from iterables, and then should probably be iterable itself, but iterables are not necessarily indexable:
sage: V={1,2,3}
sage: CartesianProduct(V,V).unrank(2)
TypeError: 'set' object does not support indexing
sage: [i for i in CartesianProduct(V,V)]
[[1, 1], [1, 2], [1, 3], [2, 1], [2, 2], [2, 3], [3, 1], [3, 2], [3, 3]]
The function "unrank" should only be applied to indexable cartesian products, i.e., cartesian products of indexables. That's not the same as iterables. In particular,
So the problem is really in tester.some_elements
. Indeed, the offending code:
try:
n = _len(S)
if n > self._max_runs:
from random import sample
S = sample(S, self._max_runs)
except (TypeError, AttributeError):
tries sample
on S
, which apparently leads to calls of the type S[i]
. As you can see, the code is prepared to fail, but doesn't expect that to happen with a ValueError
. I think this just shows that it's a bit callous to just try and index a structure and hope for the best.
Hmm. So Sets are not required to define getitem in a reasonable way? Good to know.
Nicolas Thiéry suggested deprecating R[n]
in favor of R.unrank(n)
at least when R
is a ring (see #8389 comment:13), and I tend to agree. If you don't, could you please leave a comment at #15885?
I agree too if we can do this quickly enough. This is causing problems in #10963...
Branch: u/nthiery/ticket/15919
The above branch takes a first step toward using more systematically the unrank protocol, in that case in CartesianProduct
. This should improve the situation and could be deemed good enough for now, but is certainly not a final solution. Also, the unrank function I introduced might be too paranoid: it makes CartesianProduct.unrank work only with lists, tuples, Parents, xranges; there might be other inputs that are used around. All tests pass though (haven't tried long tests yet).
I agree with Nils though: the sampling in the TestSuite should not be using [i] to do unranking, and probably should just assume that the object is iterable, and not necessarily unrankable. I'll have a look unless someone beats me to it.
The third alternative, explored by Darij in his branch, is to reinstate temporarily the notation FF[i] for unranking for IntegerModRing and friends.
What do you think?
New commits:
a4f7fc3 | 15919: a first step toward using more systematically the unrank protocol |
Changed branch from u/nthiery/ticket/15919 to u/tscrim/ticket/15919
Hey Nicolas,
I've expanded upon what unrank
can accept (all iterables, even if not in enumerated sets, should now work). I agree the sampling should not use [i]
to do unranking, but should we push that to another ticket?
New commits:
5dfb2a2 | Merge branch 'u/nthiery/ticket/15919' of trac.sagemath.org:sage into u/tscrim/ticket/15919 |
9b8d022 | Expanded what unrank can handle. |
d9b5fcd | Added warning about iterator objects. |
Not a review, just a tiny remark: the colon before the doctest in cartesian_product.py
should be doubled.
Branch pushed to git repo; I updated commit sha1. New commits:
2a5eaa5 | Fixed double-colon. |
Hi Travis!
Thanks for the improvement! I made one further step, in particular to avoid trapping too many exceptions; typical situation: a parent L that implements unranking as L[i] and raises an "out of bound" value error.
I am still unhappy about the unrank(ZZ,i)
returning something completely wrong (Integer Ring). In case a parent is iterable, I wonder if I would not prefer to prefer the Iterable strategy, and only use __getitem__
if everything else has failed.
Cheers, Nicolas
Changed branch from u/tscrim/ticket/15919 to u/nthiery/ticket/15919
Replying to @tscrim:
I agree the sampling should not use
[i]
to do unranking, but should we push that to another ticket?
+1. Let's focus here on what's needed for #10963.
New commits:
f9b59fa | Merge branch 'develop=6.2.rc0' into ticket/15919-unrank |
bbd22f1 | - Use `__getitem__` only on sequences and Parents not in EnumeratedSets |
Ah shoot, the sampling issue actually occurs with #10963.
At this point I am tempted to remove the sampling altogether in
InstanceTester.some_elements, and instead have it always return a list
(or iterator?) of the first max_run elements (or less if S
does not
have that many). Then, the only requirement on S
would be to be
iterable.
The code would then just be:
def some_elements(self, S=None):
if S is None:
if self._elements is None:
S = self._instance.some_elements()
else:
S = self._elements
import itertools
return list(itertools.islice(S,0,self._max_runs))
I am now running the tests with #10963 and this change.
Cheers, Nicolas
With just this change (and not the rest of the stuff in this branch), #10963 passes all tests.
At this point, I am tempted to split this ticket into two tickets:
What do you think?
(Much) More of me says this change to InstanceTester.some_elements()
is significantly better (and might even be faster for things that are hard to repeatably iterate over like crystals) than loosing the possibility that (over enough repeated tests) we get "most" elements of a very large finite set (and without calculating __len__()
, which also might be expensive or error catching). So I'm saying let's make this change, the question is where. I'd be happy doing it right here.
I'd say the problem with ZZ
is that it is not in the correct category, which is now #16239. I've also made some very minor doctweaks, so I'm pos_rev with the current state (up to where to make the change above).
Best,
Travis
Changed branch from u/nthiery/ticket/15919 to u/tscrim/ticket/15919
Replying to @tscrim:
(Much) More of me says this change to
InstanceTester.some_elements()
is significantly better (and might even be faster for things that are hard to repeatably iterate over like crystals) than loosing the possibility that (over enough repeated tests) we get "most" elements of a very large finite set (and without calculating__len__()
, which also might be expensive or error catching). So I'm saying let's make this change, the question is where. I'd be happy doing it right here.
Ok, glad to see we are on the same line. I have created #16244 for just this piece. I'll push my little change there later.
I'd say the problem with
ZZ
is that it is not in the correct category, which is now #16239. I've also made some very minor doctweaks, so I'm pos_rev with the current state (up to where to make the change above).
Ok, I am happy with your change. So I guess this can be positive reviewed. There remains to update the description of this ticket to better reflect what has actually been done, and refer to the other ticket for the actual solution of the original issue.
Description changed:
---
+++
@@ -121,3 +121,5 @@
ValueError: first letter of variable name must be a letter
So it seems that both bugs are due to the fancy R[x]
syntax for polynomial rings conflicting with the R[i]
syntax for the i-th element of an enumerated set R
, and #8389 didn't cause the error, but at most exposed it better.
+
+This reworks the unrank
function in sage.combinat.ranker
and uses it for the Cartesian product (this doesn't work for ZZ
, see #16239). In followup tickets, we also want to be more systematic about using unrank in TestSuite
and do better construction of some_elements()
#16244.
Done and done.
Reviewer: Travis Scrimshaw
Author: Nicolas M. Thiéry
Thanks Travis!
The only thing that somewhat worries me is the speed. Before:
sage: T = range(20)
sage: C = CartesianProduct(T,T,T)
sage: %timeit C.unrank(3)
100000 loops, best of 3: 15 µs per loop
After:
sage: T = range(20)
sage: C = CartesianProduct(T,T,T)
sage: %timeit C.unrank(3)
10000 loops, best of 3: 35.5 µs per loop
(These timings are fairly independent on the 20 and the 3...)
I don't know how often this unranking is used, though, and whether these µs add up...
Replying to @darijgr:
The only thing that somewhat worries me is the speed. Before: ... I don't know how often this unranking is used, though, and whether these µs add up...
Ah good point; the isinstance
and in EnumeratedSets
checks cause a little
overhead which is non negligible for inputs that have super fast
unranking (like lists).
My impression is that this unranking is not used much. If it really became a bottleneck, a reasonable fix would be to have CartesianProduct build unranking functions once for all upon the first unrank (I started toying with this in u/nthiery/15919-unrank-unrank-from), so that the above checks would be done only once.
Besides, things should eventually resolve by themselves as CartesianProduct gets progressively replaced by cartesian_product (for which the inputs are parents, or coerced into parents), and parents implement more systematically the unrank protocol.
So altogether, in this balance between overhead and correctness, I would lean here toward correctness. What do you think?
I am not suggesting to go against correctness! My original idea on this ticket was to de-prioritize the notation R[x]
for a polynomial ring or a subalgebra generated by x
, because that notation is really syntactic sugar. So it's speed vs. convenience. But looking at Travis's posts I am no longer sure if this would have fixed all issues. Anyway, since unrank and CartesianProduct aren't very much used together, this is not an issue.
Replying to @darijgr:
I am not suggesting to go against correctness!
Oops, sorry if my wording involuntarily implied this :-) I just meant
that I was ready paying a little speed overhead as a price for a
solution of the problem. And I agree the problem worked on this ticket
(being cleaner with unrank versus __getitem__
) has deviated from the
original one. Not so bad since I believe this + #16244 fixes
reasonably the original problem.
Unless someone raises an objection now, this is good to go!
Thanks to both of you.
Changed branch from u/tscrim/ticket/15919 to 79d4698
This should give a list of 3-arrays of elements of Z/29. Instead I get:
And 29 is the smallest number where this seems to throw this error; also, it doesn't fail if I only take the cartesian product of two copies of the field.
I get a similar error if I do
CartesianProduct(FF, FF).unrank(3)
, and this has been around for a while (not just since beta4):So it seems that both bugs are due to the fancy
R[x]
syntax for polynomial rings conflicting with theR[i]
syntax for the i-th element of an enumerated setR
, and #8389 didn't cause the error, but at most exposed it better.This reworks the
unrank
function insage.combinat.ranker
and uses it for the Cartesian product (this doesn't work forZZ
, see #16239). In followup tickets, we also want to be more systematic about using unrank inTestSuite
and do better construction ofsome_elements()
#16244.CC: @mezzarobba @sagetrac-sage-combinat
Component: algebra
Keywords: notation, algebra, polynomial
Author: Nicolas M. Thiéry
Branch/Commit:
79d4698
Reviewer: Travis Scrimshaw
Issue created by migration from https://trac.sagemath.org/ticket/15919