rapidsai / cuml

cuML - RAPIDS Machine Learning Library
https://docs.rapids.ai/api/cuml/stable/
Apache License 2.0
4.25k stars 533 forks source link

[FEA] Gracefully accept scikit-learn-specific parameters in estimator APIs as pass-throughs #3461

Open beckernick opened 3 years ago

beckernick commented 3 years ago

Some estimators do not accept all parameters accepted by their corresponding scikit-learn estimator, while others do. For API compatibility, it would be nice to support these arguments to constructors as "pass-through" parameters and perhaps raise a warning or error if they are passed and not None. This additional API compatibility would improve the process of building cuML into downstream libraries and applications.

RandomForestClassifier and Regressor currently take this pass-through approach. cuDF takes a similar pass-through approach to pandas compatibility to enable using some methods with Dask, where these "unsupported" parameters are expected to be None (or the supported default).

Today, Random Forest takes an explicit, hard-coded approach to this task:

https://github.com/rapidsai/cuml/blob/54ea23c33e1d44a8f79413ed13e7607ad6e3e211/python/cuml/ensemble/randomforest_common.pyx#L75-L90

@dantegd were chatting about this offline. Hard-coding to the current API is one way to do this. Another potential way he suggested to broadly enable this for other estimators might be to wrap this concept into a decorator that inspects the corresponding function signature.

beckernick commented 3 years ago

I suspect any solution would also need to innocuously handle the following scenario, which may call down to the Base class depending on the estimator implementation.

clf = Estimator()
clf.set_params(**{"n_jobs": None})
github-actions[bot] commented 3 years ago

This issue has been labeled inactive-30d due to no recent activity in the past 30 days. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed. This issue will be labeled inactive-90d if there is no activity in the next 60 days.