Closed johnthagen closed 7 years ago
I like this idea in principle; it's a good example of boilerplate generation that the library can do for you.
It looks like there's a flaw in the implementation sketch though -- it seems it would happily translate e.g.
@dataclass(immutable=True)
class FakeNews:
    news: List[str]
into a class like
class FakeNews:
    def __init__(self, news: List[str]) -> None:
        self._news = news

    @property
    def news(self) -> List[str]:
        return self._news
but this would not be immutable by the definition that's typically used. (E.g. a FakeNews item could not be used as a dict key, since it's not hashable.)
Hmm, good point. This feels somewhat like "interior mutability vs exterior mutability" from my time in Rust. In this case, you still get the protection that your FakeNews instances will always point to the same news list, even though that list could be modified.
I'm curious, if FakeNews took a tuple, would it be hashable?
Would a different (weaker) parameter name help with this? I feel like immutable makes the intent clear, but I do agree that in Python it's difficult to satisfy in a pure sense. I suppose it's possible we could disallow all mutable types from being used as attributes in immutable=True data classes. But then we might have to recurse into user-defined types and it'd probably get hairy quickly.
I'd still find a very nice use for the non-pure "immutable" data types, personally. The amount of boilerplate currently needed for it generally pushes me away from it, which is unfortunate.
When I was programming in Scala, we usually referred to case classes (Scala's term for data classes) as "immutable", just not "fully immutable", if they had members that were mutable objects, because you couldn't reassign the members. I wonder if it's a C-world vs Java-world distinction.
An immutable dataclass could generate a hash code automatically:
def __hash__(self):
    return hash((self._name, self._unit_price, self._quantity_on_hand))
This would correctly fail if any of the members were not hashable.
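To make the hashing behaviour concrete, here is a minimal hand-written sketch of what such a generated class might look like. The class name Headlines and its single news member are hypothetical, chosen only for illustration; this is not the output of any real dataclass option.

```python
from typing import Sequence


class Headlines:
    """Hand-written stand-in for a hypothetical immutable data class."""

    def __init__(self, news: Sequence[str]) -> None:
        self._news = news

    @property
    def news(self) -> Sequence[str]:
        # Read-only: no setter is defined, so assignment raises AttributeError.
        return self._news

    def __eq__(self, other: object) -> bool:
        if not isinstance(other, Headlines):
            return NotImplemented
        return self._news == other._news

    def __hash__(self) -> int:
        # Hash a tuple of the members; raises TypeError if a member is unhashable.
        return hash((self._news,))


# A tuple member is hashable, so the instance can be used as a dict key.
hashable_item = Headlines(("a", "b"))
lookup = {hashable_item: 0}

# A list member is not hashable, so hashing the instance fails.
unhashable_item = Headlines(["a", "b"])
try:
    {unhashable_item: 0}
except TypeError as exc:
    print(f"TypeError: {exc}")
```

With a tuple member the instance works as a dict key; with a list member the failure surfaces at hash time rather than at construction, which mirrors how tuples themselves behave.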
TBH how does this differ from frozen=True on the class?
frozen=True prevents adding new attributes after construction:
my_item = InventoryItem(name='pizza', unit_price=6.99, quantity_on_hand=5)
my_item.absurb_headline = "Python attacks!"  # Can't add new attributes
immutable=True prevents attributes from being modified after construction:
my_item = InventoryItem(name='pizza', unit_price=6.99, quantity_on_hand=5)
my_item.name = "Guido"  # AttributeError thrown. Guido is not for sale.
Both are important, but orthogonal. If I were using data classes, I'd often set both to True.
IIUC frozen=True also prevents mutating existing attributes, and the effect it has on new attributes is incidental.
@gvanrossum : as far as "immutable", I don't see this case as different from:
>>> t = (1, [], 3)
>>> t[1].append(2)
>>> t
(1, [2], 3)
>>> {t:0}
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unhashable type: 'list'
I think everyone would agree that a tuple is immutable.
I'm not sure I see much difference between the existing frozen=True and the proposed immutable=True (except maybe a performance difference). The intent of frozen=True is to disallow you from assigning to instance fields. If it also prevents you from assigning to non-field attributes (or creating new non-fields), that's okay with me.
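A short check of this behaviour, assuming the standard-library dataclasses module (Python 3.7 or later) rather than the early PyPI backport. The InventoryItem fields follow the names used earlier in the thread; absurd_headline is just an arbitrary non-field attribute name.

```python
from dataclasses import FrozenInstanceError, dataclass


@dataclass(frozen=True)
class InventoryItem:
    name: str
    unit_price: float
    quantity_on_hand: int = 0


item = InventoryItem(name="pizza", unit_price=6.99, quantity_on_hand=5)

# Try to rebind an existing field, then to create a brand-new attribute.
rejected = []
for attribute in ("name", "absurd_headline"):
    try:
        setattr(item, attribute, "changed")
    except FrozenInstanceError as exc:
        # FrozenInstanceError is a subclass of AttributeError.
        rejected.append(attribute)
        print(f"{attribute}: {exc}")
```

Both assignments are rejected, so a separate immutable flag would add little on top of frozen=True for this use case.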
@ericvsmith You're correct, frozen=True already does this. It wasn't clear to me from the PEP that this was how it was designed to work (perhaps a short example would help other readers?).
I installed dataclasses (0.1) from PyPI with pip on Python 3.6.
And ran this code:
from dataclasses import dataclass


@dataclass(frozen=True)
class Pizza:
    name: str


def main() -> None:
    p = Pizza(name='pizzzza')
    print(p.name)
    p.name = 'new'
    print(p.name)


if __name__ == '__main__':
    main()
Correctly throws:
dataclasses.FrozenInstanceError: cannot assign to field 'name'
No squiggles in PyCharm yet, but I'm sure they'll teach it about dataclasses when it's official.
frozen=True does do what I had wanted. Thanks for the explanation.
If someone wants to create a data class in which all instances are immutable (i.e. each attribute cannot be changed after construction), I propose that an immutable parameter be added (which in the spirit of Python defaults to False). Note this is different than frozen, which applies to monkey patching new attributes.
Currently, this can be done manually with normal classes with a lot of boilerplate and the use of @property. In other languages, such as Kotlin, data classes are immutable by default.
A sketch of this proposal would be as follows:
Would desugar into something like:
If one attempts to modify a property, an AttributeError is raised. IDEs can lint for this kind of thing while the user types before runtime. PyCharm, for example, squiggles a warning if you try to set a property.
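The listings for this proposal are not reproduced here. As a rough illustration only, the manual @property boilerplate it describes might look like the sketch below. It assumes the InventoryItem fields and the underscore-prefixed backing attributes mentioned elsewhere in the thread, and it is not the proposal's actual desugaring.

```python
class InventoryItem:
    """Hand-written read-only class; a stand-in for the proposed immutable=True output."""

    def __init__(self, name: str, unit_price: float, quantity_on_hand: int = 0) -> None:
        self._name = name
        self._unit_price = unit_price
        self._quantity_on_hand = quantity_on_hand

    # Each property exposes a backing attribute without defining a setter.
    @property
    def name(self) -> str:
        return self._name

    @property
    def unit_price(self) -> float:
        return self._unit_price

    @property
    def quantity_on_hand(self) -> int:
        return self._quantity_on_hand


my_item = InventoryItem(name="pizza", unit_price=6.99, quantity_on_hand=5)
try:
    my_item.name = "Guido"
except AttributeError as exc:
    # Assigning to a property that has no setter raises AttributeError.
    print(f"AttributeError: {exc}")
```

Note that the backing attributes such as my_item._name remain writable by convention only, which is one reason the thread converges on frozen=True instead.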