rapidsai / cuml

cuML - RAPIDS Machine Learning Library
https://docs.rapids.ai/api/cuml/stable/
Apache License 2.0

[TRACKER] Ensure that estimator classes do not perform parameter validation on construction #6105

Open csadorf opened 1 week ago

csadorf commented 1 week ago

As required by cuml's development guide, estimator classes must adhere to the API and implementation guidelines put forth by scikit-learn to maximize compatibility.

Those guidelines require that estimators are initialized without any parameter validation or mutation:

There should be no logic, not even input validation, and the parameters should not be changed. The corresponding logic should be put where the parameters are used, typically in fit.
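To make the guideline concrete, here is a minimal sketch of the pattern it describes. `ThresholdClassifier` and its parameters are hypothetical and are not part of cuml or scikit-learn; the only point is that `__init__` stores its arguments untouched and `fit` does the checking.

```python
class ThresholdClassifier:
    """Hypothetical estimator, used only to illustrate the convention."""

    def __init__(self, threshold=0.5, penalty="l2"):
        # Store every parameter exactly as passed: no checks, no conversion.
        self.threshold = threshold
        self.penalty = penalty

    def fit(self, X, y=None):
        # Validation lives where the parameters are actually used.
        if self.penalty not in ("l1", "l2"):
            raise ValueError(
                f"penalty must be 'l1' or 'l2', got {self.penalty!r}"
            )
        if not 0.0 <= self.threshold <= 1.0:
            raise ValueError(
                f"threshold must be in [0, 1], got {self.threshold!r}"
            )
        # Derived values go into separate trailing-underscore attributes,
        # so the original init parameters stay unchanged.
        self.threshold_ = float(self.threshold)
        return self


# Construction with an invalid value succeeds; the error surfaces in fit.
est = ThresholdClassifier(penalty="bogus")
try:
    est.fit([[0.0]])
except ValueError as exc:
    print(f"rejected in fit: {exc}")
```

Deferring validation this way means an estimator can always be constructed from the parameters it reports, whatever their values, which is what cloning and parameter-search utilities that rebuild estimators from stored parameters depend on.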

Some of cuml's estimators do not adhere to that standard. This issue tracks the audit of all of cuml's estimator classes and the fixes for those that do not comply.
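One possible way to screen classes during such an audit is sketched below. The helper `init_params_not_stored_unchanged` and the two demo classes are hypothetical, not part of cuml. The helper assumes every constructor parameter has a default (as scikit-learn's conventions expect) and only exercises default values, so it can miss validation that is triggered by non-default inputs.

```python
import inspect


def init_params_not_stored_unchanged(cls):
    """Return names of __init__ parameters whose default is not stored as-is.

    Hypothetical audit helper: constructs cls() with defaults and checks that
    each parameter is available as an attribute holding the very same object.
    """
    signature = inspect.signature(cls.__init__)
    defaults = {
        name: param.default
        for name, param in signature.parameters.items()
        if name != "self" and param.default is not inspect.Parameter.empty
    }
    estimator = cls()
    offenders = []
    for name, default in defaults.items():
        # Identity check: any conversion or substitution yields another object.
        if not hasattr(estimator, name) or getattr(estimator, name) is not default:
            offenders.append(name)
    return offenders


class Compliant:
    def __init__(self, max_iter=None, tol=1e-4):
        self.max_iter = max_iter
        self.tol = tol


class Mutating:
    def __init__(self, max_iter=None, tol=1e-4):
        # Replaces None with a concrete value at construction time.
        self.max_iter = 1000 if max_iter is None else max_iter
        self.tol = tol


print(init_params_not_stored_unchanged(Compliant))  # []
print(init_params_not_stored_unchanged(Mutating))   # ['max_iter']
```

scikit-learn ships its own estimator checks in `sklearn.utils.estimator_checks`; a standalone helper like this is only a quick first pass.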

The following estimators have been checked and are confirmed to not perform any parameter validation or mutation during construction:

The following estimators have been identified as validating and/or mutating init parameters:

betatim commented 1 week ago

LogisticRegression is another class that performs parameter validation in the constructor.

csadorf commented 1 week ago

> LogisticRegression is another class that performs parameter validation in the constructor.

Updated the issue description. I've structured the task lists such that it would be easy to create a sub-task if desired.