Closed NicolasHug closed 3 years ago
Similarly to https://github.com/scikit-learn/scikit-learn/issues/14215#issuecomment-506878019, it might be better to do this with nose2pytest or a similar tool, in a few PRs by a contributor with some experience.
Yes, I'd rather it be done with an automated tool if possible.
Sunday afternoon, taking refuge from the heat in an air conditioned cafe, taking a stab at this one. Sounds like fun...
Can I work on this @NicolasHug or should I wait for the sprint ? I can do all of them if you want.
@sameshl sorry, we ended up deciding not to include this as a sprint issue, since doing it automatically is less error-prone.
@adrinjalali I can't remember whether you opened a PR for this?
Do we allow "cleaning" the estimator checks? We have a soft dependency on pytest and that would create a hard dependency on pytest for downstream packages that want to call check_estimator. I would be in favor of doing that but we should make that decision consciously.
@NicolasHug I will try doing it with nose2py or some automated tool then.
@sameshl IIRC the ones left are not covered by nose2py. I've already applied that tool, the rest need to be done manually, I think.
@adrinjalali I am ready to take it up and do it manually.
go ahead then :)
I am starting with cluster/tests
@thomasjpfan This issue shouldn't close. I will complete the rest too.
@sameshl It automatically got closed because you said "fixes xxxx" in your PR. You should instead say "related to xxx", or "towards xxx"
@adrinjalali Yeah, It was my bad. Thanks for pointing it out!
@NicolasHug Could you please update the checklist for this issue. The following have been completed:
sklearn/compose/tests/ (https://github.com/scikit-learn/scikit-learn/pull/14670)
sklearn/covariance/tests/ (https://github.com/scikit-learn/scikit-learn/pull/14674)
sklearn/datasets/tests/ (https://github.com/scikit-learn/scikit-learn/pull/14676)
sklearn/decomposition/tests/ (https://github.com/scikit-learn/scikit-learn/pull/14679)
sklearn/feature_extraction/tests/ (https://github.com/scikit-learn/scikit-learn/pull/14694)
sklearn/feature_selection/tests/ (https://github.com/scikit-learn/scikit-learn/pull/14697)
sklearn/manifold/tests/ (https://github.com/scikit-learn/scikit-learn/pull/14699)
Thanks for the list, done @sameshl
are we splitting this up for ease of reviewing? This can be done with a regex, right? @sameshl are you using a regex or doing this manually?
oh or with nose2pytest?
IIRC I did apply nose2pytest to the whole codebase and it's already in. So the leftovers need to be done in a different way.
Yeah nose2pytest doesn't support assert_raises, from the readme,
Some Nose functions can be handled via a global search-replace, so a fixer was not a necessity:
assert_raises: replace with pytest.raises
and
Some Nose functions don't have a one-line assert statement equivalent, they have to remain utility functions:
assert_raises_regex
But it still should be possible to do it with a regex I think?
Their docs say:
Some Nose functions can be handled via a global search-replace, so a fixer was not a necessity:
assert_raises: replace with pytest.raises
assert_warns: replace with pytest.warns
And
Some Nose functions don't have a one-line assert statement equivalent, they have to remain utility functions:
assert_raises_regex
assert_raises_regexp # deprecated by Nose
but I'm pretty sure these can be done with a regex.
lol @rth
are we splitting this up for ease of reviewing?
Yes. I thought it might be easier this way.
This can be done with a regex, right? @sameshl are you using a regex or doing this manually?
I am not exactly sure about how I might do that with regex. So I am doing it with a couple of vim macros I made myself.
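For reference, here is a rough sketch of what a regex-based conversion could look like. It is not the approach used in the PRs above (those were done with vim macros). The `convert` helper is hypothetical, and it only handles the simplest case: a single-line call where the exception and the callable are plain dotted names. Anything else (tuples of exceptions, multi-line calls, `assert_raises` used as a context manager) is left untouched and still needs manual work, as does adding `import pytest` to the file.

```python
import re

# Matches single-line calls such as `    assert_raises(ValueError, clf.fit, X, y)`.
# Assumption: exception and callable are simple dotted names on one line.
PATTERN = re.compile(
    r"^(?P<indent>[ \t]*)assert_raises\([ \t]*"
    r"(?P<exc>[\w.]+)[ \t]*,[ \t]*"
    r"(?P<func>[\w.]+)[ \t]*"
    r"(?:,[ \t]*(?P<args>.*?))?[ \t]*\)[ \t]*$",
    re.MULTILINE,
)


def convert(source: str) -> str:
    """Rewrite simple assert_raises(...) calls as pytest.raises blocks."""

    def repl(m):
        indent = m.group("indent")
        args = m.group("args") or ""
        return (
            f"{indent}with pytest.raises({m.group('exc')}):\n"
            f"{indent}    {m.group('func')}({args})"
        )

    return PATTERN.sub(repl, source)


if __name__ == "__main__":
    print(convert("    assert_raises(ValueError, clf.fit, X, y)\n"))
```

The resulting diff should still be reviewed by hand, since a regex cannot tell whether the surrounding code relies on the old helper in other ways.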
Just a note here. We can also replace assert_raise_message. This is the same deal.
@adrinjalali Another update to the checklist of completed :
sklearn/cluster/tests/ (https://github.com/scikit-learn/scikit-learn/pull/14649)
sklearn/metrics/cluster/tests/ (https://github.com/scikit-learn/scikit-learn/pull/14707)
sklearn/metrics/tests/ (https://github.com/scikit-learn/scikit-learn/pull/14715)
sklearn/neural_network/tests/ (https://github.com/scikit-learn/scikit-learn/pull/14716)
sklearn/preprocessing/tests/ (https://github.com/scikit-learn/scikit-learn/pull/14717)
sklearn/semi_supervised/tests/ (https://github.com/scikit-learn/scikit-learn/pull/14841)
sklearn/svm/tests/ (https://github.com/scikit-learn/scikit-learn/pull/14727)
sklearn/tree/tests/ (https://github.com/scikit-learn/scikit-learn/pull/14737)
Another update to the checklist of completed :
Done. Thanks!
Note that sklearn/utils/estimator_checks.py should be excluded, as we can't import pytest there.
I am working on utils/tests/.
Me and @feras-oughali will be working on sklearn/ensemble/tests/
Me and @stevekola are working on sklearn/neighbors/tests/
Me and @cycks are working on sklearn/model_selection/tests/
Me and @abdulelahsm are working on `sklearn/linear_model/tests/`
I am working on sklearn/tests/test_dummy.py with @marenwestermann
Running git grep "assert_raises" sklearn/tests/ yields many matches in subfiles. Maybe you want to open a PR just for one such specific subfile first.
working on sklearn/tests/test_base.py
One question: should we also replace assert_warning_message and assert_no_warnings with the pytest.warns(..) syntax?
Yes, you can replace them as well. Though if the resulting diff is already large, better to do it in a separate PR.
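To illustrate the warnings case, here is a minimal sketch. `deprecated_helper` and `quiet_helper` are made-up functions for the example, not scikit-learn code. Note that `pytest.warns` only checks that a warning is raised; for the "no warnings" case the sketch uses the standard library to turn any warning into an error, which is one common way to express it.

```python
import warnings

import pytest


def deprecated_helper():
    # Hypothetical function that emits a warning.
    warnings.warn("deprecated_helper is deprecated", FutureWarning)
    return 1


def quiet_helper():
    # Hypothetical function that must not warn.
    return 2


def test_warns_with_message():
    # Replaces a helper that checks both the warning class and its message.
    with pytest.warns(FutureWarning, match="is deprecated"):
        assert deprecated_helper() == 1


def test_no_warnings():
    # Any warning raised inside this block becomes an exception and fails the test.
    with warnings.catch_warnings():
        warnings.simplefilter("error")
        assert quiet_helper() == 2
```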
@NicolasHug Can you please update the checklist: sklearn/tests is now completed.
Looks like all modules in my original message have been done. @jeremiedbb , you re-opened this issue recently: was it just waiting for #19864 to be merged, or are there more modules that need some cleaning?
git grep assert_raises still finds occurrences (excluding estimator_checks). Same for assert_raises_regexp and assert_raise_message. I guess some PRs claimed to clean a whole module while only cleaning a subset.
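A small script can make the leftover check repeatable. The sketch below is only an illustration of the idea, with a hypothetical `find_leftovers` helper; it scans `.py` files, skips `estimator_checks.py`, and uses word boundaries so that a test merely named `test_assert_raises_exceptions` is not reported.

```python
import re
from pathlib import Path

# Word boundaries avoid matching names like `test_assert_raises_exceptions`.
HELPERS = re.compile(
    r"\b(assert_raises|assert_raises_regex|assert_raises_regexp"
    r"|assert_raise_message)\b"
)
# Files where pytest must not be imported, so they are skipped here.
EXCLUDED = {"estimator_checks.py"}


def find_leftovers(root):
    """Map each file under `root` to the line numbers that still use a helper."""
    leftovers = {}
    for path in sorted(Path(root).rglob("*.py")):
        if path.name in EXCLUDED:
            continue
        lines = path.read_text(encoding="utf-8").splitlines()
        hits = [i for i, line in enumerate(lines, start=1) if HELPERS.search(line)]
        if hits:
            leftovers[str(path)] = hits
    return leftovers


if __name__ == "__main__":
    for filename, line_numbers in find_leftovers("sklearn").items():
        print(filename, line_numbers)
```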
Would you like to keep those for another sprint or should we keep cleaning the remaining?
Also, it seems like sklearn/mixture is not included in the original list. Was that intentional?
I think we can just get these done now. I updated the list with some more stuff
+1 with @NicolasHug opinion
@NicolasHug The remaining two in sklearn/tests are:
sklearn/tests/test_base.py: commented out todo that shows a future recommendation for tests
sklearn/tests/test_isotonic.py: the name of the test is test_assert_raises_exceptions
I think they can be removed from the checklist.
I'm working on sklearn/metrics/cluster/tests/test_unsupervised.py.
Btw: sklearn/metrics/cluster/tests/test_unsupervised.py appears twice in the list at the top.
All of the remaining open points should be closed when #19999 and #20104 are merged.
@NicolasHug The following files have been cleaned on #20104:
Based on the PR from #20065, test_unsupervised can also be marked as done because only the function names have assert_raises.
@NicolasHug
I also don't find any leftovers. As discussed, the exception is sklearn/utils/estimator_checks.py (and dependencies) where we do not want to depend on pytest.
A Great Thank You to all contributors!
As explained above we should still remove assert_raises etc. from sklearn/utils/tests/test_estimator_checks.py, but use the custom sklearn/utils/_testing.raises util instead of pytest.
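To show why a pytest-free helper is enough here, below is a rough stand-in written with the standard library only. It is not the implementation of sklearn/utils/_testing.raises; the name `raises_sketch` and its two parameters are assumptions made for this example.

```python
import re
from contextlib import contextmanager


@contextmanager
def raises_sketch(expected_exc_type, match=None):
    """Pass only if the block raises `expected_exc_type` (hypothetical helper)."""
    try:
        yield
    except expected_exc_type as exc:
        # The expected exception was raised; optionally check its message.
        if match is not None and re.search(match, str(exc)) is None:
            raise AssertionError(
                f"Error message {str(exc)!r} does not match {match!r}"
            ) from exc
    else:
        # The block finished without raising anything.
        raise AssertionError(f"Did not raise {expected_exc_type.__name__}")


if __name__ == "__main__":
    with raises_sketch(ValueError, match="bad input"):
        raise ValueError("bad input for X")
```

Because nothing here imports pytest, a helper of this kind can be used from code that downstream packages call without pytest installed.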
~(Saving this for the upcoming sprints, ideally)~
Let's remove the use of assert_raises, assert_raise_message, and assert_raises_regex.
These should be replaced with the pytest context manager:
(no need for match in the case of assert_raises and assert_raise_message I guess).
For contributors: pick one of the modules below, and please comment on this issue saying e.g. "I'm working on cluster/tests", to avoid other contributors choosing the same modules.
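As a minimal sketch of the replacement with the pytest context manager (`check_positive` is a made-up function for the example, not scikit-learn code):

```python
import re

import pytest


def check_positive(x):
    # Hypothetical function under test.
    if x <= 0:
        raise ValueError(f"x must be positive (got {x})")
    return x


def test_raises():
    # Before: assert_raises(ValueError, check_positive, -1)
    with pytest.raises(ValueError):
        check_positive(-1)


def test_raises_regex():
    # Before: assert_raises_regex(ValueError, "must be positive", check_positive, -1)
    with pytest.raises(ValueError, match="must be positive"):
        check_positive(-1)


def test_raises_literal_message():
    # `match` is a regular expression searched in the message, so a literal
    # message containing special characters should be escaped first.
    msg = "x must be positive (got -1)"
    with pytest.raises(ValueError, match=re.escape(msg)):
        check_positive(-1)
```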
You can see all the occurrences of the entries that need to be removed with e.g. git grep "assert_raises" sklearn/ensemble/tests/.
Modules that need cleaning
sklearn/cluster/tests/ #14649
sklearn/compose/tests/ #14670
sklearn/covariance/tests/ #14674
sklearn/datasets/tests/ #14676
sklearn/decomposition/tests/ #14679
sklearn/ensemble/tests/ #19399
sklearn/feature_extraction/tests/ #14694
sklearn/feature_selection/tests/ #14697
sklearn/linear_model/tests/ #19440
sklearn/manifold/tests/ #14699
sklearn/metrics/cluster/tests/ #14707
sklearn/metrics/tests/ #14715
sklearn/model_selection/tests/ #19592
sklearn/neighbors/tests/ #19388
sklearn/neural_network/tests/ #14716
sklearn/preprocessing/tests/ #14717
sklearn/semi_supervised/tests/ #14841
sklearn/svm/tests/ #14727
sklearn/tests/ #19500
sklearn/tree/tests/ #14737
sklearn/utils/estimator_checks.py
sklearn/utils/tests/ #16337
Some more:
see #20065 sklearn/metrics/cluster/tests/test_unsupervised.py
sklearn/compose/tests/test_column_transformer.py
sklearn/covariance/tests/test_robust_covariance.py
sklearn/datasets/tests/test_openml.py
sklearn/datasets/tests/test_samples_generator.py
sklearn/decomposition/tests/test_nmf.py
sklearn/decomposition/tests/test_factor_analysis.py
sklearn/feature_extraction/tests/test_text.py
sklearn/linear_model/tests/test_bayes.py
sklearn/linear_model/tests/test_sag.py
sklearn/linear_model/tests/test_ransac.py
sklearn/manifold/tests/test_locally_linear.py
sklearn/metrics/cluster/tests/test_unsupervised.py
sklearn/mixture/tests/test_bayesian_mixture.py
sklearn/mixture/tests/test_gaussian_mixture.py
sklearn/svm/tests/test_bounds.py
sklearn/svm/tests/test_sparse.py
sklearn/svm/tests/test_svm.py
sklearn/tests/test_base.py
sklearn/tests/test_isotonic.py
Also:
sklearn/utils/tests/test_estimator_checks.py but this should use the custom raises CM in sklearn/utils/_testing.py instead, as we don't want to use pytest for this file