pola-rs / polars

Dataframes powered by a multithreaded, vectorized query engine, written in Rust
https://docs.pola.rs

Deprecation decorators messing with type checking #16962

Open m-legrand opened 3 weeks ago

m-legrand commented 3 weeks ago

Reproducible example

import polars as pl

data = [{"name": "abc", "value": 1}, {"name": "abc", "value": 2}, {"name": "xyz", "value": 3}]
frame = pl.DataFrame(data, schema={"name": str, "value": int})
summary_frame = frame.group_by("name", maintain_order=True).agg(pl.col("value").max())

Log output

No response

Issue description

Using the latest versions of Polars, I noticed more and more type hint warnings, leading me to sprinkle my codebase with `# noqa` comments, which I would like to avoid. The warnings seem to come from decorated functions, in particular the ones decorated for deprecation. The function returned by the decorator is always shown with a signature consisting of a single parameter called P, which generally differs from the original function's signature.

Screenshots from PyCharm below:

[two screenshots]

Expected behavior

The functions shouldn't raise type warnings in PyCharm. A function decorated for deprecation should have the same signature as the undecorated function (except perhaps for the deprecated parameters?).

Searching the web for similar problems, I found the decohints package, which tries to solve this issue. Maybe copying their approach would work?

Side note: I wonder if it is possible to mimic the standard library's modus operandi, at least for deprecated functions in Polars. The hints for these deprecations are very explicit; screenshot and implementation below:

![image](https://github.com/pola-rs/polars/assets/9072763/c2bbe78e-8078-4b23-b42a-9a8d8d7cc965)

```python
@classmethod
def utcnow(cls):
    "Construct a UTC datetime from time.time()."
    import warnings
    warnings.warn("datetime.datetime.utcnow() is deprecated and scheduled for "
                  "removal in a future version. Use timezone-aware "
                  "objects to represent datetimes in UTC: "
                  "datetime.datetime.now(datetime.UTC).",
                  DeprecationWarning,
                  stacklevel=2)
    t = _time.time()
    return cls._fromtimestamp(t, True, None)
```
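For comparison, the same inline pattern applied to an ordinary function might look like the sketch below. The function `legacy_total` is hypothetical and stands in for any deprecated function. Because the warning is raised in the function body and no decorator is involved, the signature that tools see is the one written in the source.

```python
import warnings


def legacy_total(values: list[int]) -> int:
    """Hypothetical deprecated function that warns inline, stdlib-style."""
    warnings.warn(
        "legacy_total() is deprecated and scheduled for removal in a "
        "future version. Use the built-in sum() instead.",
        DeprecationWarning,
        stacklevel=2,  # attribute the warning to the caller's line
    )
    return sum(values)


# DeprecationWarning may be hidden by default filters outside __main__,
# so enable it explicitly to observe the message.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    total = legacy_total([1, 2, 3])

print(total, [str(w.message) for w in caught])
```

Python 3.13 also adds `warnings.deprecated` (PEP 702), a decorator that type checkers can recognize. It is left out here so that the sketch runs on the Python 3.12 reported below.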

Installed versions

```
--------Version info---------
Polars: 0.20.31
Index type: UInt32
Platform: Windows-10-10.0.19045-SP0
Python: 3.12.3 (tags/v3.12.3:f6650f9, Apr 9 2024, 14:05:25) [MSC v.1938 64 bit (AMD64)]

----Optional dependencies----
adbc_driver_manager:
cloudpickle:
connectorx: 0.3.3
deltalake:
fastexcel:
fsspec:
gevent:
hvplot:
matplotlib: 3.9.0
nest_asyncio: 1.6.0
numpy: 1.26.4
openpyxl:
pandas: 2.2.2
pyarrow: 16.1.0
pydantic:
pyiceberg:
pyxlsb:
sqlalchemy: 2.0.30
torch:
xlsx2csv:
xlsxwriter: 3.2.0
```
m-legrand commented 3 weeks ago

In all transparency, this might be purely a PyCharm problem, and there may be nothing Polars can do. But PyCharm is used so widely (in particular in the professional world, with PyCharm Professional for Windows) that it's worth investigating IMHO.

I've also raised this issue (and another similar one concerning arithmetic calculations on pl.Series) directly to PyCharm in parallel: https://youtrack.jetbrains.com/issue/PY-73306/Type-hints-fail-for-decorated-functions

m-legrand commented 1 week ago

Hi, just an update: the main issue raised here was a problem with PyCharm, and it is solved in their latest build. The other issue (arithmetic on pl.Series) is still present, but I suspect this is also a problem on their end.

series = pl.Series(values=[1, 2, 3])
(series - series.mean()).sum()  # tooltip saying: Unresolved attribute reference 'sum' for class 'Decimal'

I'll try to get to the bottom of this with them and will close this issue myself if and when this is solved by their team.