Open JukkaL opened 4 years ago
I am not sure we need any general rules here; we can instead decide on a case-by-case basis (e.g. for dictionary iterators we decided to handle subclasses safely). However, ideally all deviations from CPython behavior should be clearly documented.
I prefer having the general rule for several reasons:
The same arguments arguably apply to special casing builtin functions, to a certain extent. However, monkey patching builtin functions seems very rare and a questionable practice, whereas subclassing builtin classes happens a lot, and even the implementation of mypy does this.
> However, monkey patching builtin functions seems very rare...
Besides, monkey patching is already heavily restricted, so users need to learn about it in any case when starting to use mypyc.
@msullivan Do you have any thoughts about this proposal?
Currently we don't honor overrides for certain operations on primitive types. If a subclass of `list` overrides `__getitem__`, for example, we still use `list.__getitem__` for instances of the subclass if the static type is `list`. The motivation is improving performance of some of the most common operations.

We don't have a detailed policy about this, and it's often unclear when implementing new primitives what to do about subclasses. I propose that we should restrict ignoring overrides to only a few specific operations. It would also be good to document this clearly and have a term that we can use to refer to these operations.
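
A minimal sketch of the difference described above, run under plain CPython. `ShoutingList` and `first` are hypothetical names used only for illustration. The interpreter honors the override; the direct `list.__getitem__` call approximates what compiled code is described as doing when the static type is `list`.

```python
from typing import List


class ShoutingList(list):
    """Hypothetical list subclass that overrides __getitem__."""

    def __getitem__(self, index):
        # Delegate to list, then upper-case the result.
        return str(super().__getitem__(index)).upper()


def first(items: List[str]) -> str:
    # Interpreted: dispatches to ShoutingList.__getitem__.
    # Compiled with static type `list`: per the description above,
    # the override is ignored, as if list.__getitem__ were called.
    return items[0]


items = ShoutingList(["a", "b"])
interpreted = first(items)             # goes through the override
bypassed = list.__getitem__(items, 0)  # skips the override
```

Here `interpreted` goes through the subclass method while `bypassed` does not, which is the kind of deviation from CPython behavior this issue is about.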
We don't support overrides for any value types, including `int` and fixed-width tuples. This is something we can't change easily, since we actually convert values at runtime and lose the original type. I take this as a given.

I think that additionally at least these `list` operations can ignore overrides:

- `__getitem__` with `int` index
- `__setitem__` with `int` index
- `len(listobj)`
- `for x in listobj` (and related, such as with `enumerate`)

`list.append` is another candidate, though for many uses of `append` we could use static analysis to infer that the concrete type is always `list` (unlike the operations above).

In particular, I think that `dict` operations and variable-length `tuple` operations should always support overrides.

Discussion:
- [...] `list` objects and once for subclasses. This would slow down compilation, increase code size, and complicate the compiler.
- [...] `dict` operations, aren't performance-sensitive enough to see significant performance improvements from ignoring overrides.
- [...] `defaultdict`).
- [...] `dict` operation (`dict.__getitem__`) already has to deal with overrides anyway (due to stdlib `dict` subclasses), so there's not much justification for ignoring overrides in less common operations, in my opinion.
- [...] `list` and overriding basic behavior is not as common. I found two examples in the stdlib Python code where this happens (outside tests), and one of them was only used internally in a class.
- [...] `[x] * n` usually can be optimized anyway), are less common, or the relative performance impact is less.
- [...] `list` types such as `Any`, `Sequence`, `MutableSequence`, or the subclass type. These will match Python semantics.

Thoughts?
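
For reference, a small sketch of two points from the discussion, run under plain CPython. It shows a stdlib `dict` subclass (`defaultdict`) whose indexing depends on subclass behavior, and a function annotated with `Sequence` instead of `list`, which per the last point above would keep Python semantics. `OneBasedList` and `head` are hypothetical names.

```python
from collections import defaultdict
from typing import Sequence

# defaultdict is a stdlib dict subclass: indexing a missing key
# goes through defaultdict.__missing__ and creates a default value.
counts = defaultdict(int)
counts["x"] += 1


class OneBasedList(list):
    """Hypothetical list subclass with 1-based integer indexing."""

    def __getitem__(self, index):
        return super().__getitem__(index - 1)


def head(items: Sequence[int]) -> int:
    # The static type is Sequence, not list, so under the proposal
    # this indexing would always honor the subclass override.
    return items[1]


result = head(OneBasedList([10, 20]))
```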