astral-sh / ruff

An extremely fast Python linter and code formatter, written in Rust.
https://docs.astral.sh/ruff
MIT License

PD rules trigger on non-Pandas DataFrames #6432

Open beskep opened 11 months ago

beskep commented 11 months ago

command: ruff check test.py
ruff version: ruff 0.0.282
settings: select = ['ALL']

example:

import polars as pl

pldf = pl.DataFrame()
pldf.pivot()  # PD010 `.pivot_table` is preferred to `.pivot` or `.unstack`; provides same functionality

Unlike pandas, the polars DataFrame provides a .pivot() method but no .pivot_table().

charliermarsh commented 11 months ago

This is difficult for us to fully resolve without a full type inference engine. We could use heuristics, such as not flagging these rules when Polars is imported, but that comes with other problems: you don't have to import Polars in order to access a Polars DataFrame, and importing Polars doesn't mean you aren't working with Pandas DataFrames somewhere. This likely won't be fixed in the near term.

(I'd recommend against using these rules if you're working with Polars.)
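To make the two failure modes of the import-based heuristic concrete, here is a minimal sketch using Python's standard `ast` module. It is not how ruff works internally (ruff is written in Rust); the helper `imports_module`, the sample sources, and the `my_project.loaders.load_frame` function are all hypothetical and exist only for illustration.

```python
import ast


def imports_module(source: str, module: str) -> bool:
    """Return True if `source` imports `module` (or one of its submodules)."""
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            if any(alias.name.split(".")[0] == module for alias in node.names):
                return True
        elif isinstance(node, ast.ImportFrom):
            # level == 0 means an absolute import such as `from polars import col`.
            if node.level == 0 and (node.module or "").split(".")[0] == module:
                return True
    return False


# Case 1: a Polars DataFrame arrives from elsewhere, so polars is never imported.
# `my_project.loaders.load_frame` is a hypothetical helper returning a polars DataFrame.
NO_IMPORT = """
from my_project.loaders import load_frame

df = load_frame()
df.pivot()
"""

# Case 2: polars is imported, but the call in question is on a pandas DataFrame.
BOTH = """
import pandas as pd
import polars as pl

pdf = pd.DataFrame()
pdf.pivot()
"""

# "Skip the PD rules when polars is imported" gives the wrong answer both times.
skip_when_polars_hidden = imports_module(NO_IMPORT, "polars")  # rules still fire on a Polars frame
skip_when_both_imported = imports_module(BOTH, "polars")  # rules are skipped for a pandas frame
```

In the first case the heuristic does not help at all, and in the second it silences the rule for code that really is pandas.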

MarcoGorelli commented 10 months ago

For a simpler heuristic, would it be possible to check the alias used to instantiate the DataFrame? pl.DataFrame rather than pd.DataFrame gives a pretty strong clue that it's not pandas.
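A rough sketch of what such an alias check could look like, again using Python's `ast` module rather than ruff's actual Rust implementation. The function names `frame_origins` and `flagged_pivots` are hypothetical, and the analysis is deliberately naive: it only recognizes direct `name = alias.DataFrame(...)` assignments and ignores control flow and reassignment.

```python
import ast


def frame_origins(source: str) -> dict:
    """Map variable names to the library whose `DataFrame(...)` call created them."""
    tree = ast.parse(source)

    # Bound name -> top-level module, e.g. {"pd": "pandas", "pl": "polars"}.
    aliases = {}
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                top_level = alias.name.split(".")[0]
                aliases[alias.asname or top_level] = top_level

    origins = {}
    for node in ast.walk(tree):
        if not (isinstance(node, ast.Assign) and isinstance(node.value, ast.Call)):
            continue
        func = node.value.func
        # Only match the shape `<alias>.DataFrame(...)`.
        if (
            isinstance(func, ast.Attribute)
            and func.attr == "DataFrame"
            and isinstance(func.value, ast.Name)
            and func.value.id in aliases
        ):
            for target in node.targets:
                if isinstance(target, ast.Name):
                    origins[target.id] = aliases[func.value.id]
    return origins


def flagged_pivots(source: str) -> list:
    """Line numbers of `.pivot()` calls, skipping receivers built by a non-pandas library."""
    origins = frame_origins(source)
    lines = []
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Attribute)
            and node.func.attr == "pivot"
        ):
            receiver = node.func.value
            origin = origins.get(receiver.id) if isinstance(receiver, ast.Name) else None
            # Unknown origin (None) is still flagged, matching the current behavior.
            if origin in (None, "pandas"):
                lines.append(node.lineno)
    return sorted(lines)


EXAMPLE = """import pandas as pd
import polars as pl

pldf = pl.DataFrame()
pldf.pivot()
pddf = pd.DataFrame()
pddf.pivot()
"""

origins = frame_origins(EXAMPLE)
flagged = flagged_pivots(EXAMPLE)
```

With this sketch, only the pandas call on line 7 of `EXAMPLE` is reported. Frames that reach the code by any other route (function arguments, return values, attributes) stay unknown and would still be flagged, so this narrows the false positives without removing them.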

kleinicke commented 2 months ago

Currently the pandas rules are applied to many non-pandas objects. For example, PD011 tries to stop you from using .values anywhere, even with a library where you are supposed to use it. Some kind of check for whether the object actually belongs to pandas would therefore be pretty useful.

bje- commented 5 days ago

The same thing happens with the Python DEAP package, which has class members named values.
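For illustration, a minimal sketch of the pattern being described. The `Fitness` class is a hypothetical stand-in for any non-pandas class with a `values` attribute, not DEAP's actual code. The `# noqa: PD011` comment uses ruff's standard per-line suppression syntax; that it is needed on this exact line is an assumption based on the reports in this thread.

```python
class Fitness:
    """Hypothetical stand-in for a non-pandas class exposing a `values` attribute."""

    def __init__(self, values):
        self.values = tuple(values)


fitness = Fitness([0.5, 1.25])

# Same attribute name that PD011 looks for, but no pandas object is involved.
# The per-line suppression keeps the rest of the PD rules enabled.
best = max(fitness.values)  # noqa: PD011
```

Suppressing line by line scales poorly when `.values` is used throughout a codebase, which is why disabling the rule (or the whole PD group) is the more practical option in that situation.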