Closed ericvsmith closed 6 years ago
A possible option is to add a hidden kw-only argument to generated __init__
, let's call it _root
, which is False
by default. If it is true, then __post_init__
will not be called in generated __init__
. Instead, it can be called manually in replace()
after copying fields:
@dataclass
class C:
x: int
def __post_init__(self):
...
C(x=42, _root=True) # __post_init__ is not called!
We have been using similar pattern for NamedTuple
for some time in typing
, it seems to be OK.
For what it's worth, this is how attrs
behaves, showing they don't do anything with init=False
fields:
import attr
@attr.s
class C:
i = attr.ib()
j = attr.ib(init=False)
def __attrs_post_init__(self):
if self.i == 1:
self.j = 10 * self.i
c = C(1)
print(c)
c1 = attr.evolve(c, i=2)
print(hasattr(c1, 'j'))
print(c1)
Produces:
C(i=1, j=10)
False
Traceback (most recent call last):
File "attr-test.py", line 16, in <module>
print(c1)
File "/cygdrive/c/home/eric/local/venvs/attrs/lib/python3.6/site-packages/attr/_make.py", line 814, in repr_
for name in attr_names
File "/cygdrive/c/home/eric/local/venvs/attrs/lib/python3.6/site-packages/attr/_make.py", line 814, in <genexpr>
for name in attr_names
AttributeError: 'C' object has no attribute 'j'
After giving this a good bit of thought, I'm leaning toward 4 above: drop the post-create step in replace()
. The new design woudl be that replace()
will just create the new object by calling the class's __init__()
with the right params (which will call __post_init__()
as usual), and then return that object. Also, I think it's correct to drop the current requirement that init=False
fields have a default
or default_factory
. This is the approach that attrs
takes.
My reasoning is that it's not dataclass's job to make sure that the created object has all of the fields set, it's the class author's job. If it's possible that a code path through __init__
and __post_init__
leaves a field unset, that's a bug in the class definition. Or maybe it's not a bug, depending on the requirements for that class. Trying to guess the author's intent is just not feasible. Consenting adults and all.
After all, if the class has some wacky creation rules that might leave some fields unset, then that's what __post_init__
is for: you can have your own post-creation verification and/or additional field setting logic.
Closed via #76.
Interesting development. It seems to assume that init=False fields are only useful as fields computed by __post_init__
. But the example from the PEP (a computed area
field) only works for immutable (really "frozen") objects, or if users have the discipline to maintain the invariant manually or call the appropriate setter method. Then again I have no concrete example of why one would use init=False fields anyways.
Good point. I'll change the PEP example to be a frozen class.
It seems to me there's no right behavior for how replace()
should work. Maybe I'll add some wording that if you're writing a class for which the default replace()
behavior doesn't work, you should write your own replace()
member function and publicize that.
Sounds good. I guess init=False fields should be rare.
On second thought, I'm not going to change the example in the PEP, although I've added some wording about init=False
fields.
The reason I'm not going to make it a frozen class is because setting attributes in __post_init__()
for frozen classes is complicated, and I think it takes away from the point of the example. It would become:
@dataclass(frozen=True)
class Square:
length: float
area: float = field(init=False, default=0.0)
def __post_init__(self):
object.__setattr__(self, 'area', self.length * self.length)
Which seems to bury the point of the example.
Sounds good.
The module-level
replace()
function is defined as taking a dataclass instance and a number of replacement values (calledchanges
). A new object is created, passing to its__init__()
the value of fields from the original object or fromchanges
, if they're given.The question then is: what to do with fields that are
init=False
? Since they're not passed in to__init__
, there's no way to specify them on the new instance. Currently I have post-create logic which says that for eachinit=False
field I set the new object's value to the original object's value.This works great, except where the
init=False
field is initialized in__post_init__
. If there is a__post_init__
, then any value set there is overwritten by my post-create logic. For example:Produces:
That is, even though the new object's
j
field is calculated in__post_init__
to be 20, it's overwritten by the post-create logic which copies the original object'sj
value of 10 over the new object's value.The only thing I can think of doing here is to change the post-create loop to say that it should only copy the original object's value if the new object doesn't have an attribute set for the
init=False
field. But I'm not sure how I can detect that, since allinit=False
fields must have a default value, so it actually is set on the new object, even if it's never touched by__post_init__
.Maybe we could just raise an error if
replace()
is called on any dataclass that has aninit=False
field and an__post_init__
method, since there's no way to know what the correct logic should be.So:
init=False
fields have been set in__post_init__
, and if not copy them from the original. Probably not possible.replace()
on a dataclass that has anyinit=False
field and also has a__post_init__
function. This is overly restrictive, but at least prevents the problem outlined here.attrs
does. Also, it doesn't require that aninit=False
parameter has a default value, which adds other wrinkles.Any other approach that I'm missing?
[Edited to add option 4]