Open Conchylicultor opened 8 months ago
We previously discussed this as something some users might want in https://github.com/python/cpython/issues/88089#issuecomment-1093912596. At that point in time, we decided not to allow multiple inheritance with arbitrary classes, just with typing.Generic
, because:
NamedTuple
is heavily special-cased by type checkers as it is. (Note that I don't consider this a particularly strong argument: the runtime should generally leave it to type checkers to disallow constructs they don't support. It's not the runtime's job to disallow constructs that type checkers don't understand.)If there's a need for using mixin classes with NamedTuple
, I'm open to revisiting this restriction.
One immediate issue I see is what to do about __slots__
. Currently all namedtuple classes have __slots__
set to the empty tuple, which means that instances of namedtuple classes use less memory than instances of classes that don't use __slots__
. But setting __slots__
to the empty tuple is useless unless all base classes also have __slots__
set to the empty tuple. What if somebody tries to use a mixin class like A
in @Conchylicultor's example, that doesn't set __slots__
? Should we raise an exception, on the grounds that this namedtuple class won't come with the memory savings you might expect from using a namedtuple? Or should we assume the user knows what they're doing here, and let it pass?
Thanks for the answer, I understand.
I don't really have answer on the various points you raised. But I can provide more context on my use-case.
In our codebase, we're defining some protocol, to handle saving various object types (Dataset, Model,...):
class Checkpointable(abc.ABC):
@abc.abstractmethod
def __kd_restore__(self, x):
...
class Dataset(Checkpointable):
...
class State(Checkpointable):
...
def restore[T: Checkpointable](path: PathLike, obj: T) -> T:
...
To save multiple object at once, I was trying to add some wrapper:
class TopLevel(typing.NamedTuple, Checkpointable):
state: State
ds: Dataset
The reason I was trying to use named tuple
rather than dataclasses
is because it allow a much more compact syntax in this specific case:
state, ds = restore('/tmp/', TopLevel(state, ds))
vs
out = restore('/tmp/', TopLevel(state, ds))
state = out.state
ds = out.ds
My actual use-case is more complicated with more indirections, variables. But that's the main idea
I updated #31781. It is just removing 4 lines of code which forbid multiple inheritance.
The currently working workaround is to use an immediate class:
class TopLevel(collections.namedtuple('TopLevel', ('state', 'ds')), Checkpointable):
pass
or
class TopLevel(typing.NamedTuple):
state: State
ds: Dataset
class TopLevel(TopLevel, Checkpointable):
pass
Is such complication necessary or we can omit an immediate step?
The issue with __slots__
is solved in the same way as in other cases of using __slots__
or named tuples: you should add __slots__ = ()
in all base classes if you want to get a benefit from __slots__
in child classes.
So, what should we do with this? There were at least two requests for this feature.
@gvanrossum, @JelleZijlstra, @AlexWaygood?
If this works, why not? The implementation seems simple enough. :-) Could have some more tests for normal cases and other edge cases.
But I'd like the typing council to consider this, since all the type checkers also have to support this, and given (IIRC) the special-casing for NamedTuple, that may be more complicated. @erictraut @JukkaL
since all the type checkers also have to support this
Do they? Just because you can do something at runtime doesn't mean that type checkers must support it. There are many things that are allowed at runtime (including when it comes to typing-specific features) that type checkers disallow. To some extent, this is the point of a type checker.
I agree that I'd like the typing council to consider whether this is something type checkers should support. But I actually think whether or not static type checkers should support it is a separate question to whether we should allow it at runtime or not. Type checkers are not and should not feel obligated to support every behaviour of the typing module that works without an exception being raised.
I feel that NamedTuple is special to type checkers -- and in fact it was originally introduced as something you could put in a .pyi file to create a properly typed collections.namedtuple. So I think it would be unfortunate if the same type suddenly acquired a feature that type checkers don't know about.
If they support indirect multiple inheritance with subclasses of NamedTuple (see https://github.com/python/cpython/issues/116241#issuecomment-2098210154), they should not have problems with supporting more direct multiple inheritance. Do they?
I feel that NamedTuple is special to type checkers -- and in fact it was originally introduced as something you could put in a .pyi file to create a properly typed collections.namedtuple.
I could make the same argument about PEP-585 subscriptions, which were a typing-specific feature introduced specifically for typing purposes, but allow many things at runtime that type checkers will complain about:
>>> dict[int, str, bool, bytes]
dict[int, str, bool, bytes]
>>> list[42]
list[42]
If they support indirect multiple inheritance with subclasses of NamedTuple (see #116241 (comment)), they should not have problems with supporting more direct multiple inheritance. Do they?
Both mypy and pyright accept indirect multiple inheritance but reject direct multiple inheritance:
So yes, they would both require at least some modifications in order to support this. I don't know whether that's as simple as just removing a few lines of code that are there to make sure that they complain about the thing that fails at runtime, though, or whether it would require them to rewrite some fundamental assumptions about how NamedTuple
s work.
In the case of pyright, this change would be easy to support. Just a few lines of code to make the current check conditional on the python version. I'm guessing this is true of other type checkers as well.
Edit: This would require a small change to the typing spec, which currently states:
NamedTuple does not support multiple inheritance. Type checkers should enforce this limitation.
I could make the same argument about PEP-585 subscriptions, which were a typing-specific feature introduced specifically for typing purposes, but allow many things at runtime that type checkers will complain about: [...]
That seems to be a wrong analogy (maybe just a strawman?). What you want to do with multiple inheritance here is useful at runtime; the examples you quoted are useless, and we don't promise that those incorrect usages won't become errors at runtime too (they're the result of an implementation shortcut).
But given other feedback I think this is fine, and I will approve the PR next.
What you want to do with multiple inheritance here is useful at runtime; the examples you quoted are useless, and we don't promise that those incorrect usages won't become errors at runtime too (they're the result of an implementation shortcut)
That's a fair point. (Though I'm quite confident there are other examples I could come up with using the typing-module internals of "useful" things that are allowed at runtime but disallowed by the typing spec and type checkers :-)
I cringe every time I see code that uses undocumented parts of typing.py — they are risking breakage but also constraining what we can do.
I cringe every time I see code that uses undocumented parts of typing.py — they are risking breakage but also constraining what we can do.
So do I; I fully agree. That wasn't what I meant (I should have been clearer; apologies). I was thinking more about the fact, for example, that ClassVar
s subscripted by type variables are allowed at runtime despite the spec explicitly prohbiting them; or the fact that Literal[int]
has no meaning to type checkers currently (though it could plausibly be used to mean "exactly the int
class, and not a subclass"), but is allowed. These and other things are things which are allowed at runtime, which could plausibly be useful, and which users might reasonably expect type checkers to support (but which they don't).
>>> from typing import *
>>> T = TypeVar("T")
>>> class Foo(Generic[T]):
... x: ClassVar[T]
...
>>> Foo.__annotations__
{'x': typing.ClassVar[~T]}
>>> Literal[str]
typing.Literal[str]
>>> Literal[int]
typing.Literal[int]
Anyway, this is a pretty minor point. I agree that none of these are an exact analogy to NamedTuple
, where the runtime is already strict and we are considering loosening it.
I don't see this as something that is important to change, though the implementation in type checkers is probably not very hard. My reasoning is that it adds ambiguity about inherited attributes (would they be settable?) and I'm a little worried that there may be additional edge cases that we haven't thought of yet.
Basically I see this as only slightly useful (it hasn't been requested much even though NamedTuple was added years ago), but it makes things more complicated, with a small risk of some tricky or confusing edge cases. If there are some unclear edge cases, we'd probably need to document them as well, and write conformance tests, etc. Then every new user who learns about NamedTuple may have to deal with all these, at least when reading the documentation. I prefer to keep it simple unless there is a compelling reason not to.
I feel that this is important to change. There are only four lines of code stopping multiple base inheritance at runtime, and while I understand removing the current limitation could make type checkers need to do a bit more work, I think that's better than leading people like myself to having to make a copy implementation of NamedTupleMeta
without those four lines or jump through other hoops to achieve the desired result. It encourages bad practices if someone is determined. I understand from reading here that the limitation was imposed at the time NamedTuple
was introduced to adhere to exactly what the specs say, but that's just because no one thought of a reason to allow that at the time. Allowing multiple base classes means if you want some sort of shared functionality between multiple objects, that's now possible, just like other built-in python objects. It feels a bit arbitrary that NamedTuple
doesn't share this feature.
Even if a typing-related change looks small, the effort quickly adds up. There are at least 6 type checkers which could want to implement support for it, since they probably try to follow standards. A small change might take 4 hours to implement (implementation, writing tests, documentation, code review) per type checker. So it could take 24 hours of effort overall, as a very rough approximation. If there are tricky edge cases, it may be much more that this.
Couldn’t we try to phase out NamedTuple in favor of data classes with slots? I still think that NamedTuple should only exist as a quick typed version of collections.namedtuple, not as a feature in itself.
Feature or enhancement
Proposal:
Currently this code is failing:
Raising:
But this code is not:
I believe those 2 snippets are mostly equivalent (the final
B
class behave the same), so it's confusing and force boilerplate thatclass B(A, typing.NamedTuple):
fail.Has this already been discussed elsewhere?
This is a minor feature, which does not need previous discussion elsewhere
Links to previous discussion of this feature:
No response
Linked PRs