Behaviour of E721 with numpy.dtype

astral-sh / ruff

An extremely fast Python linter and code formatter, written in Rust.

https://docs.astral.sh/ruff

MIT License

31.83k stars 1.07k forks source link

Behaviour of E721 with numpy.dtype #9570

Closed nstarman closed 8 months ago

nstarman commented 9 months ago

numpy.dtype is tripping rule E721. I'm not sure it should be.

x = np.array([1,2,3], dtype=float)
x.dtype == float  # <- trips E721

.dtype is an object, e.g. a dtype('float64'). While x.dtype == float may not be the best means of testing a dtype, I don't think this is E721.

charliermarsh commented 9 months ago

What about x.dtype is np.dtype(float)?

nstarman commented 9 months ago

Looking at https://github.com/numpy/numpy/blob/5feea41396910a3065c8e79886127fad4cbdebf3/numpy/__init__.pyi#L844, it appears that == is fine for dtypes.

I suppose we should ask a numpy dev — ping @mhvk — about preferred syntax.

mhvk commented 9 months ago

dtype are a bit of a strange beast, but definitely best thought of as instances, not classes, and they are meant to be comparable not just to their own class, but also to the corresponding scalar types (e.g., x.dtype == np.float32) and strings (e.g., x.dtype == ['i1,i4']; basically, __eq__ always tries to do dtype(other).

charliermarsh commented 8 months ago

Okay, we can try to special-case it when comparing a .dtype attribute directly, though we'll still get it wrong in the general case (since given x == y, we can't know definitively whether x is a dtype).

charliermarsh commented 8 months ago

@mhvk -- Just to clarify, is x.dtype == float is preferable over (e.g.) x.dtype is np.dtype(float) or looking at the dtype kind or char?

zanieb commented 8 months ago

I feel like x.dtype is np.dtype(float) is more correct, but this might make sense to exclude from E721 as it doesn't seem worse to use == float.

mhvk commented 8 months ago

I think it is much safer in general usage to use ==, since is really only works for the basic numpy built-in dtype. np.dtype("i,i") is np.dtype("i,i") gives False, because a new structured dtype is created for every call. So, to the point of this PR, dtype really should not generally be considered classes and thus no preference for is over isinstance or ==.

charliermarsh commented 8 months ago

Okay, sounds good. Thanks for chiming in.