astral-sh / ruff

An extremely fast Python linter and code formatter, written in Rust.
https://docs.astral.sh/ruff
MIT License
33.06k stars 1.1k forks source link

Request: Unsafe fix for `custom-type-var-return-type`/`PYI019` #14183

Closed Avasam closed 1 week ago

Avasam commented 2 weeks ago

https://docs.astral.sh/ruff/rules/custom-type-var-return-type/ I feel like logically a fix for this rule should be feasible without too much complexity? (at least in pyi files and maybe py files above 3.10, see https://github.com/astral-sh/ruff/issues/9761). Useful to migrate codebases still using the old TypeVar way of typing Self. The fix should be unsafe as there's still edge-cases where using a TypeVar is needed (typeshed has some of those). Marking it unsafe may also simplify the logic to apply the fix (if Self already exists in scope)

Avasam commented 1 week ago

CC @InSyncWithFoo

On v0.7.4 running ruff check --select=PYI019 --fix --preview on https://github.com/microsoft/python-type-stubs/ still leaves me with 9 errors. What's making it so these are not autofixable ? I can't find any difference between them and the one that gets fixed.

stubs\sklearn\calibration.pyi:89:10: PYI019 Methods like `fit` should return `Self` instead of a custom `TypeVar`
   |
87 |         y: ArrayLike,
88 |         sample_weight: None | ArrayLike = None,
89 |     ) -> _SigmoidCalibration_Self: ...
   |          ^^^^^^^^^^^^^^^^^^^^^^^^ PYI019
90 |     def predict(self, T: ArrayLike) -> ndarray: ...
   |
   = help: Replace with `Self`

stubs\sklearn\cross_decomposition\_pls.pyi:48:75: PYI019 Methods like `fit` should return `Self` instead of a custom `TypeVar`
   |
46 |         copy: bool = True,
47 |     ) -> None: ...
48 |     def fit(self: _PLS_Self, X: MatrixLike, Y: MatrixLike | ArrayLike) -> _PLS_Self: ...
   |                                                                           ^^^^^^^^^ PYI019
49 |     def transform(
50 |         self, X: MatrixLike, Y: None | MatrixLike = None, copy: bool = True
   |
   = help: Replace with `Self`

stubs\sklearn\decomposition\_nmf.pyi:71:89: PYI019 Methods like `fit` should return `Self` instead of a custom `TypeVar`
   |
69 |         verbose: int = 0,
70 |     ) -> None: ...
71 |     def fit(self: _BaseNMF_Self, X: MatrixLike | ArrayLike, y: Any = None, **params) -> _BaseNMF_Self: ...
   |                                                                                         ^^^^^^^^^^^^^ PYI019
72 |     def inverse_transform(self, W: MatrixLike) -> ndarray | spmatrix: ...
   |
   = help: Replace with `Self`

stubs\sklearn\ensemble\_hist_gradient_boosting\binning.pyi:40:62: PYI019 Methods like `fit` should return `Self` instead of a custom `TypeVar`
   |
38 |         n_threads: None | Int = None,
39 |     ) -> None: ...
40 |     def fit(self: _BinMapper_Self, X: MatrixLike, y=None) -> _BinMapper_Self: ...
   |                                                              ^^^^^^^^^^^^^^^ PYI019
41 |     def transform(self, X: MatrixLike) -> ndarray: ...
42 |     def make_known_categories_bitsets(self) -> tuple[ndarray, ndarray]: ...
   |
   = help: Replace with `Self`

stubs\sklearn\feature_selection\_univariate_selection.pyi:55:69: PYI019 Methods like `fit` should return `Self` instead of a custom `TypeVar`
   |
54 |     def __init__(self, score_func: Callable | MemorizedFunc) -> None: ...
55 |     def fit(self: _BaseFilter_Self, X: MatrixLike, y: ArrayLike) -> _BaseFilter_Self: ...
   |                                                                     ^^^^^^^^^^^^^^^^ PYI019
56 |
57 | ######################################################################
   |
   = help: Replace with `Self`

stubs\sklearn\gaussian_process\_gpc.pyi:62:10: PYI019 Methods like `fit` should return `Self` instead of a custom `TypeVar`
   |
60 |         X: MatrixLike | ArrayLike,
61 |         y: ArrayLike,
62 |     ) -> _BinaryGaussianProcessClassifierLaplace_Self: ...
   |          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ PYI019
63 |     def predict(self, X: MatrixLike | ArrayLike) -> ndarray: ...
64 |     def predict_proba(self, X: MatrixLike | ArrayLike) -> ndarray: ...
   |
   = help: Replace with `Self`

stubs\sklearn\linear_model\_glm\glm.pyi:55:10: PYI019 Methods like `fit` should return `Self` instead of a custom `TypeVar`
   |
53 |         y: ArrayLike,
54 |         sample_weight: None | ArrayLike = None,
55 |     ) -> _GeneralizedLinearRegressor_Self: ...
   |          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ PYI019
56 |     def predict(self, X: MatrixLike | ArrayLike) -> ndarray: ...
57 |     def score(
   |
   = help: Replace with `Self`

stubs\sklearn\linear_model\_ridge.pyi:201:10: PYI019 Methods like `fit` should return `Self` instead of a custom `TypeVar`
    |
199 |         y: MatrixLike | ArrayLike,
200 |         sample_weight: float | None | ArrayLike = None,
201 |     ) -> _RidgeGCV_Self: ...
    |          ^^^^^^^^^^^^^^ PYI019
202 |
203 | class _BaseRidgeCV(LinearModel):
    |
    = help: Replace with `Self`

stubs\sklearn\multioutput.pyi:55:10: PYI019 Methods like `partial_fit` should return `Self` instead of a custom `TypeVar`
   |
53 |         classes: Sequence[ArrayLike] | None = None,
54 |         sample_weight: None | ArrayLike = None,
55 |     ) -> _MultiOutputEstimator_Self: ...
   |          ^^^^^^^^^^^^^^^^^^^^^^^^^^ PYI019
56 |     def fit(
57 |         self: _MultiOutputEstimator_Self,
   |
   = help: Replace with `Self`

Found 10 errors (1 fixed, 9 remaining).
InSyncWithFoo commented 1 week ago

The autofix is marked as display-only when the method has "complex" annotations. As this was the initial implementation, I only handled the "bare name" cases.

# Before
def m(self: _S, other: _S, others: list[_S]) -> _S: ...
#                      ^^          ^^^^^^^^ This is not.
#                      This is safe and simple to replace.

# After
def m(self, other: Self, others: list[_S]) -> Self: ...
Avasam commented 1 week ago

They all seem to match what you're saying though, that's what I don't get.

class _SigmoidCalibration(RegressorMixin, BaseEstimator):
    def fit(
        self: _SigmoidCalibration_Self,
        X: ArrayLike,
        y: ArrayLike,
        sample_weight: None | ArrayLike = None,
    ) -> _SigmoidCalibration_Self: ...

class _PLS(
    ClassNamePrefixFeaturesOutMixin,
    TransformerMixin,
    RegressorMixin,
    MultiOutputMixin,
    BaseEstimator,
    metaclass=ABCMeta,
):
    def fit(self: _PLS_Self, X: MatrixLike, Y: MatrixLike | ArrayLike) -> _PLS_Self: ...

class _BaseNMF(ClassNamePrefixFeaturesOutMixin, TransformerMixin, BaseEstimator, ABC):
    def fit(self: _BaseNMF_Self, X: MatrixLike | ArrayLike, y: Any = None, **params) -> _BaseNMF_Self: ...

class _BinMapper(TransformerMixin, BaseEstimator):
    def fit(self: _BinMapper_Self, X: MatrixLike, y=None) -> _BinMapper_Self: ...

class _BaseFilter(SelectorMixin, BaseEstimator):
    def fit(self: _BaseFilter_Self, X: MatrixLike, y: ArrayLike) -> _BaseFilter_Self: ...

class _BinaryGaussianProcessClassifierLaplace(BaseEstimator):
    def fit(
        self: _BinaryGaussianProcessClassifierLaplace_Self,
        X: MatrixLike | ArrayLike,
        y: ArrayLike,
    ) -> _BinaryGaussianProcessClassifierLaplace_Self: ...

class _GeneralizedLinearRegressor(RegressorMixin, BaseEstimator):
    def fit(
        self: _GeneralizedLinearRegressor_Self,
        X: MatrixLike | ArrayLike,
        y: ArrayLike,
        sample_weight: None | ArrayLike = None,
    ) -> _GeneralizedLinearRegressor_Self: ...

class _RidgeGCV(LinearModel):
    def fit(
        self: _RidgeGCV_Self,
        X: MatrixLike,
        y: MatrixLike | ArrayLike,
        sample_weight: float | None | ArrayLike = None,
    ) -> _RidgeGCV_Self: ...

class _MultiOutputEstimator(MetaEstimatorMixin, BaseEstimator, metaclass=ABCMeta):
    def partial_fit(
        self: _MultiOutputEstimator_Self,
        X: MatrixLike | ArrayLike,
        y: MatrixLike,
        classes: Sequence[ArrayLike] | None = None,
        sample_weight: None | ArrayLike = None,
    ) -> _MultiOutputEstimator_Self: ...

And the typevars are all _SomeClass_Self = TypeVar("_SomeClass_Self ", bound="_SomeClass")

But this got autofixed:

class _BaseHeterogeneousEnsemble(MetaEstimatorMixin, _BaseComposition, metaclass=ABCMeta):
    def set_params(self: _BaseHeterogeneousEnsemble_Self, **params) -> _BaseHeterogeneousEnsemble_Self: ...

Edit: Oh do you mean that no other param must be annotated? (because it's not yet checking if the TypeVar is reused)

In any case, it sounds like there's a follow-up to your first pass (thanks for that). @MichaReiser could this issue be re-openned as it is not yet completed ?