python / cpython

The Python programming language
https://www.python.org/
Other
60.92k stars 29.41k forks source link

dataclass(a=nan) == dataclass(a=nan) change in truthnes between 3.12 and 3.13 #120645

Open hroncok opened 3 weeks ago

hroncok commented 3 weeks ago

Bug report

Bug description:

from dataclasses import dataclass
@dataclass
class a:
    a: float

nan = float('nan')

Python 3.12.3

>>> nan == nan
False
>>> a(nan) == a(nan)
True

Python 3.13.0b2

>>> nan == nan
False
>>> a(nan) == a(nan)
False

The new behavior kinda makes sense, but it is a behavior change nevertheless.

Was it intentional or accidental? I cannot find anything relevant in https://docs.python.org/3.13/whatsnew/3.13.html

Possibly related to https://github.com/python/cpython/pull/104904 but I have not yet bisected this.

This breaks expectations in the testsuite of cattrs: https://github.com/python-attrs/cattrs/issues/547

CPython versions tested on:

3.13

Operating systems tested on:

Linux

terryjreedy commented 3 weeks ago

Bugfixes and internal changes are not listed in What's New except by reference to the changelob.

I am pretty sure you found the right issue. Built-in collections intentionally compares elements by is before == so that collections are equal to themselves and contain their elements even when they contain NANs.

>>> nan = float('nan')
>>> (nan,) == (nan,)
True
>>> nan == nan
False
>>> nan is nan
True
>>> {nan: nan} == {nan: nan}
True
>>> nan in (nan,)
True
>>> nan in {nan:nan}
True

These considerations do not normally apply to class instances, so I think that 'regression' versus 'bugfix' may be a decision (not by me) rather than a fact. NAN are nasty. But thank you for noticing and raising the issue.

rhettinger commented 3 weeks ago

Roughly, this is the essential difference in the generated code:

class a:

    def __init__(self, a):
        self.a = a

    def old_eq(self, other):
        return (self.a,) == (other.a,)  # Wrap data in a tuple

    def new_eq(self, other):
        return self.a == other.a        # Don't wrap data in a tuple

I don't think the prior behavior of NaNs comparing equal on the basis of identity was an intended or desired behavior. That said, the change in behavior wasn't intended either. NaN identity comparisons weren't considered at all.

Personally, I think the new behavior is the best behavior for dataclasses. The norm for pure python classes is for == to mean whatever __eq__ returns. It matches how most people would write the code by hand.

The old behavior is a bit weird. Think about it. Two distinct dataclass instances compare as equal but their components compare as unequal even though there are no special field modifiers.

There is a case to be made either way. According to IEEE-741, NaNs aren't "supposed to" compare equal even when they are identical. Built-in types such as sets, dicts, lists, and tuples don't follow that rule but most user-defined classes do.

ericvsmith commented 2 weeks ago

Yes, you're probably correct that this is related to #104904. @rhettinger, any thoughts?

rhettinger commented 2 weeks ago

@rhettinger, any thoughts?

If it were up to me, I would leave the code as-is and would add documention, "dataclass instances compare equal when the types exactly match and the corresponding field values are equal".

My rationale is 1) the expected behavior of NaNs is to never compare equal, 2) the typical use case for a NaN is as a placeholder for missing data so there is no basis for saying the values are equal, and 3) optimizing memory use with a singleton NaN ideally shouldn't change what an application does.

Consider comparing two student budgets where neither student knows what their tuition bill is going to be:

student1 = Budget(rent=202.10, tuition=float('nan'), food=45.67)
student2 = Budget(rent=202.10, tuition=float('nan'), food=45.67)

Those budgets cannot be said to be equal because possibly differing data values are missing. Also the result of student1 == student2 shouldn't change with a speed and space optimization that replaces thefloat('nan') instantiations with a math.nan singleton.

hroncok commented 2 weeks ago

As said, the 3.13 behavior indeed makes sense. However, it's a breaking chnage that happened without a deprecation period.

terryjreedy commented 2 weeks ago

That is true of every bugfix. Eric can decide.

hroncok commented 2 weeks ago

That is true of every bugfix.

What bugfix do you mean? This behavior change happened accidentally during an optimization change, didn't it?

ericvsmith commented 2 weeks ago

Unless my imagination is lacking, if we were to back out the optimization and make the change in the next release (or in a few releases from now), we couldn't generate a deprecation warning without drastically slowing down the code. The best we could do is document it. I don't know if anyone would read the deprecation documentation and realize this applies to them.

I'm still mulling it over.

rhettinger commented 2 weeks ago

However, it's a breaking chnage that happened without a deprecation period.

I would describe it differently. It was a change in non-tested, non-guaranteed behavior. Effectively, the downstream test suite was relying on an implementation detail that was not well-known, discussed, documented, tested, or intended.

Going forward, we can document, test and guarantee a definition of dataclass equality. Unlike what we have now, that would allow downstream users to make accurate predictions of what will happen with exotic objects like NaN.

FWIW that will make dataclasses better specified than most other tools in Python (for example, we don't promise exactly what list.sort() will do in the presence of a NaN or with objects that don't define a total ordering).

Unless my imagination is lacking, if we were to back out the optimization and make the change in the next release (or in a few releases from now), we couldn't generate a deprecation warning without drastically slowing down the code.

We make changes in implementation details all the time without going through a deprecation period. That can break overspecified tests but is not considered a "breaking change" (like changing a function signature) . IMO nothing good would come from reverting, deprecating, and reimplementing. It would be unnecessarily disruptive and it would deprive two generations of users of the more reasonable behavior that we all seem to prefer.