python / mypy

Optional static typing for Python
https://www.mypy-lang.org/
Other
18.43k stars 2.82k forks source link

Allow adding attributes to classes after definition #5363

Closed gwk closed 6 years ago

gwk commented 6 years ago

This is a feature request, unless I'm missing some existing functionality.

As I understand it, the way to annotate class attributes is with typing.ClassVar. However this is incompatible with certain fancy classes like Enum and NamedTuple. For types derived from those, I am able to add class attributes after closing the definition, like so:

class State(Enum):
  OFF = 0
  ON = 1
State.labels: Tuple[str,...] = ('Off', 'On')

In this contrived example, we are attaching related information to the class, rather than defining it as a completely separate top-level variable like State_labels, which I think is a reasonable design choice (encapsulation, etc).

This solution works in Python: labels is indeed a class variable and not an Enum variant. Similarly, a post-definition assignment to a NamedTuple creates a class variable and not a new named field. Sadly, Mypy rejects such code.

I understand that one of Mypy's core principles is to flag suspect attribute mutation, so this behavior is not so surprising. But given that adding attributes post-definition is the only way (that I know of) to add class variables to enums and NamedTuples, I think we should consider loosening the rules in some principled way.

One obvious solution would be to add a special type hint:

State.labels: Tuple[str,...] = ('Off', 'On') # type: add-classvar

Or, Mypy could take the type annotation left of the = as an indication that the programmer is intentionally defining a new attribute. I have not thought about this extensively, but at the moment this approach appeals to me the most.

If this is too loose for the community's tastes, we could tighten the exception to:

I hesitate to suggest these restrictions though, because broadly speaking I think that Mypy should strive to allow for precise annotation of legal Python code, even if the code is not in a "static" style. By allowing the programmer to explicitly indicate that they are adding a type var to an existing class, we will allow Mypy to properly type check a broader range of legal Python programs.

Thanks for considering this. I realize it might be a lot to ask for!

gvanrossum commented 6 years ago

Hm, I'm not at all convinced that assigning post-definition to a class variable is preferable over just using a module-global variable (which you note but reject, quoting "encapsulation, etc") -- and this is before even considering how to add a type annotation.

Your proposed "obvious" solution # type: add-classvar is too complex to implement: since you propose to combine it with a type annotation, it would have to have similar status as # type: ignore, which would require adding it to the typed_ast package.

I wonder if it wouldn't be a reasonable compromise to use a static method if you insist on encapsulation? The Enum class does allow those without creating new values.

gwk commented 6 years ago

@gvanrossum all apologies if my suggestion sounded flippant--I certainly don't mean to imply that any solution would be easy. I meant "obvious" in the sense that it was the first thing I thought of.

Regarding the utility of the pattern: I think that there are strong arguments to be made for using class attributes. As I see it, they exist in the language (and other OO languages) for the related purposes of polymorphism and encapsulation. Perhaps a more convincing example would be of using several enum classes in a data model, each of which has multiple mappings, e.g. to JSON keys and human interface labels:

class IgnitionState(Enum):
  OFF = 0
  ON = 1
  IGNITION = 2

IgnitionState.labels: Tuple[str,...] = ('Off', 'On', 'IGNITION')
IgnitionState.json_keys: Tuple[str, ...] = ('off', 'on', start') # Maybe a legacy API uses a different name.

class EngineState(NamedTuple):
  ignition:IgnitionState
  fuel:float

EngineState.labels: Tuple[str,...] = ('Ignition', 'Fuel Level')
EngineState.json_keys: Tuple[str, ...] = ('ignition', 'fuel')

With these definitions we can then write generic code to both serialize and display the data model that works for all of the participating classes. It is true that we could instead define a global mapping of classes to labels and classes to json keys. But I think that the tradeoffs are sufficiently nuanced that this is a design choice best left to the individual programmer. Is there real value in attaching related metadata to a class? My feeling is that class vars exist for good, broad reasons.

The staticmethod solution you propose is indeed a reasonable compromise, provided that the runtime is optimized not to allocate new tuples every time the methods are called. I do not know if this is the case.

If we can agree that class vars are broadly useful and good, then I think it's reasonable to hope that they work across all classes. Prohibiting their usage for certain kinds of classes just because those kinds overload the syntax would be a disappointment.

Forget my # type: add-classvar suggestion for a moment; what about the idea of just allowing the assignment immediately after the definition? It seems to me that the programmer intent in this case is downright obvious, and it is hard to imagine that the type checker would be preventing any mistakes.

gwk commented 6 years ago

But if this is simply not worth the trouble then I will understand if you just want to close the issue and move on. Thanks taking the time.

gvanrossum commented 6 years ago

"Immediately after the class" is pretty ambiguous.

I'll leave it to others on the project to chime in, but as a feature I think this has relatively benefit compared to the required effort.

JukkaL commented 6 years ago

Yeah, this does not seem important enough to be worth it, considering that the design and implementation seem non-trivial, and there is a workaround (though not perfect).

I can see how a more general way of extending classes outside the class definition might be worth it eventually, if it would allow adding new methods as well (and potentially module attributes/functions). This is something I had plans for very early in mypy development, but few users have requested such a feature. This issue would be solved as a special case.

The more general feature is much more complex to implement than the original proposed feature and would have some unfortunate properties, so it's far from certain that it will ever be implemented.

AivanF commented 3 years ago

This is an old issue, but I consider it actual. I like the proposal of gwk, it looks pretty clear and straightforward. Here is another problem that can be solver the this feature.

Some libraries & frameworks provide many handy yet complex classes, but still don't support typing. A simple workaround is to override the required classes and specify the properties you need, then use your overridden classes instead of those, so the MyPy will comprehend fields types and help coding. However, the replacement of external classes with custom ones is a really huge overhead, and it would be MUCH more simple and straightforward to just attach some fields type hinting to the 3d party classes.

I'm not sure how this types specification should be imported into other files... But I this may be helpful to many other people :)

hauntsaninja commented 3 years ago

A better approach for that kind of issue is writing your own type stubs for those classes.

lilydjwg commented 3 years ago

Hi, I'm assigning subclasses to superclass's attributes, but I can't get mypy accept it.

The proposed workaround doesn't work because there are no classes to be overridden. Also type hints on the superclass doesn't work because mypy no longer recognize the subclasses. I even can't get mypy ignore the classes.

This doesn't work:

class BuildResult:
  def __bool__(self) -> bool:
    return self.__class__ in [self.successful, self.staged]

  def __init_subclass__(cls):
    setattr(__class__, cls.__name__, cls)

  def __repr__(self) -> str:
    name = self.__class__.__name__
    return f'<BuildResult.{name}>'

  successful: successful
  staged: staged
  failed: failed
  skipped: skipped

class successful(BuildResult):
  pass

class staged(BuildResult):
  pass

class failed(BuildResult):
  def __init__(self, exc: Exception) -> None:
    self.exc = exc

  def __repr__(self) -> str:
    name = self.__class__.__name__
    return f'<BuildResult.{name}: {self.exc!r}>'

class skipped(BuildResult):
  def __init__(self, reason: str) -> None:
    self.reason = reason

  def __repr__(self) -> str:
    name = self.__class__.__name__
    return f'<BuildResult.{name}: {self.reason!r}>'
mohkale commented 2 years ago

I'm in a similair situation. I'd like to add a TRACE level to logging and use it like a first class citizen. I've done so using the fix suggested here. In each of my python files I have a logger = logging.getLogger(__name__) line. Within my code I'd like to just have logger.trace("...") and use it just like debug, info, error and warning but I now also have to prefix it a type-ignore comment because mypy can't detect that the trace attribute has been added.