eddiebergman opened this issue 2 years ago
Hey, I would like to work on this issue.
Hey @himanshu007-creator, that'd be great!

If you have some experience with auto-sklearn or sklearn and want to have a go at converting some of the pipeline components as mentioned in the issue, that'd be amazing. If you're a bit less familiar with machine learning, we would also appreciate these changes in `test/test_metrics`.

However, feel free to pick whatever you think is most interesting to you and I'd be happy to provide some guidance. I will give a rough overview of what the issues are for each major folder in `tests`.
* `test_automl` - Training models takes time and we repeat this sometimes needlessly. Some tests also end up testing multiple things to avoid this retraining.
* `test_data` - It's fine, doesn't need to be touched.
* `test_ensemble_builder` - This will be reworked in the future; the tests can be left as is.
* `test_evaluation` - Could do with some `pytest`'ing.
* `test_metalearning` - Could be cleaned up easily; it needs more tests as well, but that is a large task of understanding autosklearn's systems.
* `test_metrics` - Could be cleaned up, seems straightforward.
* `test_optimizer` - Probably requires more tests; leave it as is.
* `test_pipeline` - Majority of the work could be done here.
* `test_scripts` - Can be left alone for now.
* `test_util` - Some refactoring is happening here, so it can be left alone.

I would like to go ahead with `test_metrics`.
Hello @eddiebergman, if it is okay with @himanshu007-creator, would it be possible for me to start with `test_metric`? It is my first time contributing to this repo and `test_metric` seems like something I can start with.
@shantam-8 Please go for it!
I will provide some guidance for it in a few hours :)
Best, Eddie
Sorry, I will reply tomorrow morning; some stuff came up!
Hi @shantam-8,
So I took a look at `test_metrics` and I have a few pointers:
* Convert the class-based `unittest` tests into plain `pytest` functions:
```python
# From
import unittest

class TestScorer(unittest.TestCase):
    def test_one(self):
        ...

# To
def test_one() -> None:
    ...
```
* You can use the following replacements:
```python
# From
self.assertAlmostEqual(a, b)
self.assertTrue(t)

# To
from pytest import approx

assert a == approx(b)
assert t is True
```
* Anywhere we use something like `_ProbaScorer(...)`, we should instead use the public API way of making a `Scorer`, i.e. `make_scorer(...)`. This is to ensure the public API works as intended; see the linked definition and the sketch below it:
https://github.com/automl/auto-sklearn/blob/ee0916ece79fd8f27059c37acf08b695e2c81ac8/autosklearn/metrics/__init__.py#L242-L253
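As a rough illustration, here is a minimal sketch of such a replacement, assuming `make_scorer` takes keyword arguments matching the linked definition (`name`, `score_func`, `optimum`, `worst_possible_result`, `greater_is_better`, `needs_proba`); verify them against the actual signature, and note the concrete values are placeholders:

```python
# A hedged sketch of replacing the private _ProbaScorer with make_scorer.
# The keyword argument names are assumed from the linked definition; verify
# them against autosklearn/metrics/__init__.py before relying on this.
from sklearn.metrics import log_loss

from autosklearn.metrics import make_scorer

# From (private API):
#   scorer = _ProbaScorer("logloss", log_loss, 0, 2**31, -1, {})

# To (public API):
scorer = make_scorer(
    name="logloss",
    score_func=log_loss,
    optimum=0,                    # best achievable log loss
    worst_possible_result=2**31,  # placeholder sentinel for the worst score
    greater_is_better=False,      # lower log loss is better
    needs_proba=True,             # log_loss consumes predicted probabilities
)
```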
* It would be nice to have a docstring in each test saying what it tests:
```python
def test_sign_flip(...) -> None:
    """
    Expects
    -------
    * Flipping `greater_is_better` for `r2_score` should result in flipped
      signs of its output.
    """
```
* If there's any test that seems like it's actually testing two or more things at once, feel free to split it up into separate tests; use your own judgement and we'll discuss it during the PR :) A hypothetical sketch of such a split is shown below.
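Purely as an illustration (the test names and values below are made up, not taken from `test_metrics`), and assuming `autosklearn.metrics.accuracy` is a `Scorer` callable as `accuracy(y_true, y_pred)`, a split could look like this:

```python
# Hypothetical example: one unittest method asserting two behaviours,
# split into two focused pytest functions. Names and values are invented.
import numpy as np
from pytest import approx

from autosklearn.metrics import accuracy  # assumed public Scorer


# Before: two unrelated expectations crammed into a single test
#   def test_accuracy(self):
#       y = np.array([0, 1, 1, 0])
#       self.assertAlmostEqual(accuracy(y, y), 1.0)
#       self.assertAlmostEqual(accuracy(y, 1 - y), 0.0)


# After: each expectation gets its own test and docstring
def test_accuracy_of_perfect_prediction() -> None:
    """
    Expects
    -------
    * A prediction identical to the ground truth scores 1.0
    """
    y = np.array([0, 1, 1, 0])
    assert accuracy(y, y) == approx(1.0)


def test_accuracy_of_fully_wrong_prediction() -> None:
    """
    Expects
    -------
    * A prediction that disagrees everywhere scores 0.0
    """
    y = np.array([0, 1, 1, 0])
    assert accuracy(y, 1 - y) == approx(0.0)
```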
Please feel free to make any other changes that make sense in your judgement, as long as the core thing being tested is still being tested! You should also check out `CONTRIBUTING.md`, especially the section on testing, to understand how you can run only the tests you care about. Using `pytest test/test_metric` should be enough, but there's more info in the guide if you need it.
You can continue asking things here or make an initial PR and we can discuss further there at any point!
Thanks for thinking to contribute :)
Best, Eddie
Thanks a lot for your help! I shall open a PR ASAP.
Hello @eddiebergman, it is my first time with open source and I would like to try my hand at `test_evaluation`, if this issue is still open.
Is any help needed on this issue?
Hi, can I work on this issue? Thanks!
Some of our tests follow the `unittest` structure of creating tests. While these are fine, we want to migrate our tests to the more functional style of `pytest`.

For example, `test_classification`, which tests the classification pipeline, adopts the `unittest` style of class-structured tests, in which you must often understand the whole setup to understand one test.

In contrast, a more functional `pytest` test looks like `test_estimators.py::test_fit_performs_dataset_compression`. Here we can define the test entirely through parameters, specify multiple parameter combinations, document which ones we expect to fail, and give a reason for it. Because the test is completely defined through parameters, we can document each parameter and make the test almost completely transparent by looking at just that one test.

We would appreciate any contributors on this side who want to get more familiar or are already familiar with testing. Some familiarity with, or willingness to learn, `pytest`'s `@parametrize` and `@fixture` features is a must. Please see the contribution guide if you'd like to get started! A small sketch of this style follows below.
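Here is a minimal sketch of that functional style; it assumes nothing about auto-sklearn itself, and the fixture, function, and values are invented purely for illustration:

```python
# Illustration of pytest's @parametrize and @fixture, with one case marked
# as an expected failure and a documented reason. Everything here is a
# made-up example, not code from auto-sklearn's test suite.
import pytest


@pytest.fixture
def baseline() -> int:
    """A shared value injected into any test that requests it by name."""
    return 10


@pytest.mark.parametrize(
    "x, y, expected",
    [
        (1, 2, 3),
        (2, 3, 5),
        pytest.param(2, 2, 5, marks=pytest.mark.xfail(reason="2 + 2 != 5")),
    ],
)
def test_addition(x: int, y: int, expected: int) -> None:
    """
    Expects
    -------
    * x + y equals expected for each parametrized case
    """
    assert x + y == expected


def test_baseline_is_positive(baseline: int) -> None:
    """
    Expects
    -------
    * The fixture-provided baseline is strictly positive
    """
    assert baseline > 0
```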