rapidsai / cuml

cuML - RAPIDS Machine Learning Library
https://docs.rapids.ai/api/cuml/stable/
Apache License 2.0

TST Add basic common test infrastructure #6107

Open betatim opened 1 month ago

betatim commented 1 month ago

xref #6105

This adds a test that uses the test infrastructure scikit-learn provides to check whether an estimator is compatible with scikit-learn. For a short explanation, see https://scikit-learn.org/dev/developers/develop.html#rolling-your-own-estimator. The test uses https://scikit-learn.org/dev/modules/generated/sklearn.utils.estimator_checks.parametrize_with_checks.html#sklearn.utils.estimator_checks.parametrize_with_checks so that it is parametrized by both estimator and check, which makes it easy to see exactly what is failing.
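For readers who haven't used it, here is a minimal sketch of what such a test can look like. It is not the code in this PR: the `cuml.linear_model` import path and the fallback to scikit-learn's own `LogisticRegression` (so the snippet still runs on a machine without cuML or a GPU) are assumptions made for illustration, and pytest needs to be installed.

```python
from sklearn.utils.estimator_checks import parametrize_with_checks

try:
    # Assumed import path for the cuML estimator under test.
    from cuml.linear_model import LogisticRegression
except ImportError:
    # Stand-in so the sketch still runs without cuML installed.
    from sklearn.linear_model import LogisticRegression


@parametrize_with_checks([LogisticRegression()])
def test_sklearn_compatible_estimator(estimator, check):
    # pytest generates one test case per (estimator, check) pair,
    # so a failure report names both the estimator and the check.
    check(estimator)
```

Running pytest with `-v` on a file like this lists every check as a separate test ID, which is what makes individual incompatibilities easy to spot.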

I've only added LogisticRegression for the moment, mostly because I wanted to see what other people think of this. I think it would be useful for issues like #6105: we could find out which estimators are and aren't compatible, fix them, and make sure they stay compatible. Fixing some of the failures would probably make it easier for users to swap between scikit-learn and cuml without much effort. A downside is that right now a lot of the checks fail.

WDYT?

betatim commented 1 month ago

It now iterates over all (or at least most?) estimators in cuml and checks them. Unfortunately, a lot of the tests fail, and locally I even get an RMM error that crashes pytest.
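As a rough illustration of "iterating over estimators", one generic way to collect candidates from a module is sketched below. This is not the discovery logic used in this PR. `discover_estimators` is a hypothetical helper, and the rule "has a `fit` method and can be constructed with no arguments" is an assumption about what counts as an estimator. The demo uses a throwaway module so it runs without cuML.

```python
import inspect
import types


def discover_estimators(module):
    """Return default-constructed instances of estimator-like classes in `module`."""
    estimators = []
    for name, cls in inspect.getmembers(module, inspect.isclass):
        # Skip private and abstract classes.
        if name.startswith("_") or inspect.isabstract(cls):
            continue
        # Assumption: anything with a callable `fit` is an estimator.
        if not callable(getattr(cls, "fit", None)):
            continue
        try:
            estimators.append(cls())
        except TypeError:
            # Constructor needs arguments; skip it in this sketch.
            continue
    return estimators


# Demo on a throwaway module (all names here are made up).
class Good:
    def fit(self, X, y=None):
        return self


class NeedsArgs:
    def __init__(self, k):
        self.k = k

    def fit(self, X, y=None):
        return self


class NotAnEstimator:
    pass


demo = types.ModuleType("demo")
demo.Good = Good
demo.NeedsArgs = NeedsArgs
demo.NotAnEstimator = NotAnEstimator

found = discover_estimators(demo)
print([type(e).__name__ for e in found])
```

A list built this way could be passed to `parametrize_with_checks`. It does nothing about a hard crash such as the RMM error, though, since that takes down the whole pytest process rather than failing a single check.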