Open johanrosenkilde opened 8 years ago
+1 to returning a tuple
-1 to returning a tuple. foo.list()
should return the same as list(foo)
, which is always a list, and foo.tuple()
should return the same as tuple(foo)
, which is always a tuple. And for that matter, foo.str()
returns the same as str(foo)
which is always a string, et cetera.
Replying to @kedlaya:
-1 to returning a tuple.
foo.list()
should return the same aslist(foo)
, which is always a list, andfoo.tuple()
should return the same astuple(foo)
, which is always a tuple. And for that matter,foo.str()
returns the same asstr(foo)
which is always a string, et cetera.
This sounds quite reasonable, but the implementation details are quite different: list(...)
and tuple(...)
simply investigate the argument for being iterable and if it is, exhaust the iterator and put the results in a tuple/list. On the other hand, foo.list()
and/or foo.tuple()
ask the object to return a list and/or tuple. It can do this much more efficiently when it's cached. The "safe" option is to return a copy, but that just means incurring the cost of making a copy immediately.
The story for str
is different because that investigates if the object implements the __str__()
protocol, which is a much more straightforward translation (and strings are immutable), so I'll leave that out of consideration here.
I think there is definitely a place for giving efficient access to the constituents of an object in the form of an indexable structure (tuple or list). It seems that it's a bug trap to do this in the form of a mutable data structure (probably because the similarity to list(foo)
, where a fresh list gets constructed automatically), so perhaps a tuple is a better fit.
The method for this has traditionally been called foo.list()
in sage. Changing this to foo.tuple()
is perhaps a little heavy-handed. I think no-one would be looking for this method unless they already now it's what you should look for. Are there other names that are more evocative?
In python3, dictionaries have readonly "views" for their keys,values, and items (although they don't admit indexing). Perhaps the thing we need is something that provides an "indexable view" of the contents of an object. Tuples would work. Lists would too, but are mutable.
I'm not sure how important it is that we specify the exact type that needs to be returned. In pure python this shouldn't be important at all, but in cython you get much more efficient code if something is known to be a tuple or a list.
Types (class names) are nouns:
Methods are verbs
Its perhaps unfortunate that there is no greater selection of appropriate verbs in English, but that linguistic details shouldn't be cause to have some ridiculous bug in Sage.
Definitely +1 on returning something immutable, and a tuple is the natural choice; I have been desiring it for a while. The naming issue is somewhat annoying, but that's life, I guess. The only thing that worries me is that this is going to break backward compatibility, and I am not sure there is anything we can do ...
Alternative idea: create a new type immutable_list
inheriting from list
. This has the advantage of being backwards compatible: isinstance(foo, list)
would still be true and the printing would be like a list. This works and behaves like a list
:
cdef class immutable_list(list):
def __setitem__(self, i, value):
raise TypeError
Of course, you also need to disallow .append()
, +=
, __delitem__
, ... but the basic idea works.
Interesting suggestion. On top of that, we could possibly enable addition with tuples, and raise a warning in case of addition with a list (it's probably not possible to catch []+P.list() though), to pave the path for later changing the output to a plain tuple so that we don't keep around yet another list-like object.
Replying to @nthiery:
Interesting suggestion.
By the way, I'm not totally convinced myself that it's the best solution. It is the best solution to the practical problem of backwards-compatibility. But it does introduce some clutter: yet another list-like type.
Replying to @jdemeyer:
Replying to @nthiery:
Interesting suggestion.
By the way, I'm not totally convinced myself that it's the best solution. It is the best solution to the practical problem of backwards-compatibility. But it does introduce some clutter: yet another list-like type.
But it might be unavoidable: tuples must consist of immutable objects, so it may not always be possible to return one. Another alternative would be for .list()
to return a copy of the cached list, but I'm not sure this is better than jdemeyer's suggestion (it is worse from a performance point of view, since the copy is unnecessary if you don't actually attempt to modify the list).
Replying to @vbraun:
Types (class names) are nouns:
- list
- tuple
Methods are verbs
- list
Its perhaps unfortunate that there is no greater selection of appropriate verbs in English, but that linguistic details shouldn't be cause to have some ridiculous bug in Sage.
A quick look at the Python docs (https://docs.python.org/3/library/stdtypes.html) shows that the following are all methods of basic types:
math.floor()
math.ceil()
int.bit_length()
float.hex()
list.index()
str.title()
str.upper()
dict.keys()
dict.values()
dict.items()
This is definitely a minority of the methods one finds on that page, but it makes the point that insisting that method names be verbs is a good design principle, not an absolute rule; it definitely gets overridden when there is no simple alternative.
For that matter, sometimes one can't even tell: is foo.count()
a noun or a verb? It would be pretty un-Pythonic (and perhaps unfair to non-English speakers) to have list
mean completely different things when interpreted as a noun vs. a verb, in addition to confusing the defined terms list
and tuple
.
Replying to @kedlaya:
But it might be unavoidable: tuples must consist of immutable objects, so it may not always be possible to return one.
That is not true:
sage: A=([1],[2])
sage: A[0].append(3)
sage: A
([1, 3], [2])
That said, I don't think foo.tuple()
would be discoverable at all, so think that's a very bad name for implementing this functionality.
I think it is essential that foo.list()
continues to provide a way to indexably access constituents of an object (although it would be interesting to have data on how much code depends on that). If returning a list for that is too dangerous, I'd be fine with a tuple instead.
If it has to be a list then that would be fine with me. If it's too difficult to let people use copy
when they need to, we may just have to add an option. Something like:
sage: foo.list()
<returns a copied list>
sage: foo.list(tuple=true)
<returns the cached/stored tuple>
sage: foo.list(copy=false)
<returns the cached/stored list>
Concerning verbs/nouns: English is a pretty flexible language. You can verb any noun in it, so without grammatical context you can't really tell what it is anyway. If we want to remove the ambiguity, we should probably rename the method to foo.listify()
.
In fact, I always thought that desctructive method would be verbs (and return None) and methods without side effects would be nouns, as in L.sort()
versus L.sorted()
(except that python, having a decided preference for routines with side effects, doesn't have the latter)
Note that a common design paradigm in python is "we are all consenting adults here" (an argument I haven't seen brought up against using python for pre-university schooling yet). From that perspective, returning a mutable list (perhaps because it must be mutable internally) that shouldn't be mutated isn't an absolute no-go.
Replying to @kedlaya:
math.floor() math.ceil() float.hex() list.index() str.title() str.upper()
This sublist could be all verbs? They are definitely funnier that way.
Replying to @nbruin:
Replying to @kedlaya:
But it might be unavoidable: tuples must consist of immutable objects, so it may not always be possible to return one.
That is not true:
sage: A=([1],[2]) sage: A[0].append(3) sage: A ([1, 3], [2])
Yes, you're right. The only time it matters whether a member of a tuple is mutable is when the tuple is hashed.
So that means that one could cache a tuple and have foo.list()
return a list copy of that tuple, as long as there is no actual legitimate reason to modify the cached list (I certainly can't think of one). In a performance-critical situation, one shouldn't be asking for either a list or a tuple anyway, but rather an iterator.
Replying to @kedlaya:
So that means that one could cache a tuple and have
foo.list()
return a list copy of that tuple, as long as there is no actual legitimate reason to modify the cached list (I certainly can't think of one). In a performance-critical situation, one shouldn't be asking for either a list or a tuple anyway, but rather an iterator.
No, an iterator provides a different kind of interface: if you need indexing an iterator will just not do. Even for iteration an iterator might not be the lowest overhead: you're incurring python call overhead on each access. List/tuple items can be retrieved more efficiently than that (provided the entries have to be in memory anyway). This will be a tiny difference, though.
Replying to @nbruin:
Replying to @kedlaya:
So that means that one could cache a tuple and have
foo.list()
return a list copy of that tuple, as long as there is no actual legitimate reason to modify the cached list (I certainly can't think of one). In a performance-critical situation, one shouldn't be asking for either a list or a tuple anyway, but rather an iterator.No, an iterator provides a different kind of interface: if you need indexing an iterator will just not do. Even for iteration an iterator might not be the lowest overhead: you're incurring python call overhead on each access. List/tuple items can be retrieved more efficiently than that (provided the entries have to be in memory anyway). This will be a tiny difference, though.
If the cached thing ends up being a tuple, then there's no reason not to implement foo.tuple()
returning a reference to the cached tuple. If there is a genuine concern that this is "not discoverable" (which I'm not sure there is for users that know the difference between a list and a tuple), we can modify the docstring for foo.list()
to say SEEALSO: foo.tuple()
.
On a different point, there is also a Python design principle that "tuples are for heterogeneous data, lists are for homogeneous data". In our context, Parent.list()
will be the various members of some mathematical structure, which definitely constitute homogeneous data (they will all be of the same class, for one).
That said, all design principles are secondary to having things behave correctly...
I don't think that "tuples are for heterogeneous data, lists are for homogeneous data" is a Python-specific principle at all. Obviously you want heterogeneous data in tuples, whenever the index has some type/semantic meaning it doesn't make sense to add a new element at the front. But for homogeneous data both list and tuple make sense.
Trying to fix the bug, I did something else in #20902, which also fixes somehow the bug in the current ticket. Please check.
Perhaps I need to clarify how #20902 fixes the issue. With the patch,
sage: F=GF(5)
sage: F.list()
[0, 1, 2, 3, 4]
sage: F.list().remove(0)
sage: F.list()
[0, 1, 2, 3, 4]
sage: F.list.__module__
'sage.categories.finite_enumerated_sets'
As #20902 was merged to Sage 7.5.beta2, the bug reported in this ticket is now resolved. The issue of what should be the right type for the return value of the method .list()
still remains but is controversial. So I suggest to close this ticket and open another when we have a consensus on the issue. If we keep this ticket, then its description needs to be changed at least.
We can just keep this around for right now to note the discussion and the issue.
The current behaviour of
Parent.list()
is the road to insanity:The reason is that the output of
list
is cached, and a list is mutable, and so the user can permanently modify it. This easily happens without wanting to.The closest Python concept of an "immutable list" is a tuple.
CC: @vbraun @jdemeyer @tscrim @nthiery
Component: categories
Keywords: parent, immutable
Issue created by migration from https://trac.sagemath.org/ticket/20743