eddiebergman opened this issue 2 years ago
Hey, I would like to work on this issue.
Hey @himanshu007-creator, that'd be great!

If you have some experience with auto-sklearn or sklearn and want to have a go at converting some of the pipeline components as mentioned in the issue, that'd be amazing. If you're a bit less familiar with machine learning, we would also appreciate these changes in `test/test_metrics`.

However, feel free to pick whatever you think is most interesting to you and I'd be happy to provide some guidance. I will give a rough overview of what the issues are for each major folder in `tests`.
* `test_automl` - Training models takes time and we repeat this sometimes needlessly. Some tests also end up testing multiple things to avoid this retraining.
* `test_data` - It's fine, doesn't need to be touched.
* `test_ensemble_builder` - This will be reworked in the future; the tests can be left as is.
* `test_evaluation` - Could do with some `pytest`'ing.
* `test_metalearning` - Could be cleaned up easily; it needs more tests as well, but that is a large task of understanding autosklearn's systems.
* `test_metrics` - Could be cleaned up, seems straightforward.
* `test_optimizer` - Probably requires more tests; leave it as is.
* `test_pipeline` - Majority of the work could be done here.
* `test_scripts` - Can be left alone for now.
* `test_util` - Some refactoring is happening here, so it can be left alone.

I would like to go ahead with `test_metrics`.
Hello @eddiebergman, if it is okay with @himanshu007-creator, would it be possible for me to start with `test_metric`? It is my first time contributing to this repo and `test_metric` seems like something I can start with.
@shantam-8 Please go for it!
I will provide some guidance for it in a few hours :)
Best, Eddie
Sorry, I will reply tomorrow morning; some stuff came up!
Hi @shantam-8,
So I took a look at `test_metrics` and I have a few pointers:
* Convert the class-based `unittest` tests into plain `pytest` functions:
```python
# From
import unittest

class TestScorer(unittest.TestCase):
    def test_one(self):
        ...

# To
def test_one() -> None:
    ...
```
* You can use the following replacements:
```python
# From
self.assertAlmostEqual(a, b)
self.assertTrue(t)

# To
from pytest import approx

assert a == approx(b)
assert t is True
```
* Anywhere we use something like `_ProbaScorer(...)`, we should instead use the public API way of making a `Scorer`, i.e. `make_scorer(...)`. This is to ensure the public API works as intended; see the linked definition and the sketch below it:
https://github.com/automl/auto-sklearn/blob/ee0916ece79fd8f27059c37acf08b695e2c81ac8/autosklearn/metrics/__init__.py#L242-L253
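As a rough illustration, here is a minimal sketch of such a replacement, assuming `make_scorer` takes keyword arguments matching the linked definition (`name`, `score_func`, `optimum`, `worst_possible_result`, `greater_is_better`, `needs_proba`); verify them against the actual signature, and note the concrete values are placeholders:

```python
# A hedged sketch of replacing the private _ProbaScorer with make_scorer.
# The keyword argument names are assumed from the linked definition; verify
# them against autosklearn/metrics/__init__.py before relying on this.
from sklearn.metrics import log_loss

from autosklearn.metrics import make_scorer

# From (private API):
#   scorer = _ProbaScorer("logloss", log_loss, 0, 2**31, -1, {})

# To (public API):
scorer = make_scorer(
    name="logloss",
    score_func=log_loss,
    optimum=0,                    # best achievable log loss
    worst_possible_result=2**31,  # placeholder sentinel for the worst score
    greater_is_better=False,      # lower log loss is better
    needs_proba=True,             # log_loss consumes predicted probabilities
)
```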
* It would be nice to have a docstring in each test saying what it tests:
```python
def test_sign_flip(...) -> None:
    """
    Expects
    -------
    * Flipping `greater_is_better` for `r2_score` should result in flipped
      signs of its output.
    """
```
* If there's any test that seems like it's actually testing two or more things at once, feel free to split it up into separate tests; use your own judgement and we'll discuss it during the PR :) A hypothetical sketch of such a split is shown below.
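Purely as an illustration (the test names and values below are made up, not taken from `test_metrics`), and assuming `autosklearn.metrics.accuracy` is a `Scorer` callable as `accuracy(y_true, y_pred)`, a split could look like this:

```python
# Hypothetical example: one unittest method asserting two behaviours,
# split into two focused pytest functions. Names and values are invented.
import numpy as np
from pytest import approx

from autosklearn.metrics import accuracy  # assumed public Scorer


# Before: two unrelated expectations crammed into a single test
#   def test_accuracy(self):
#       y = np.array([0, 1, 1, 0])
#       self.assertAlmostEqual(accuracy(y, y), 1.0)
#       self.assertAlmostEqual(accuracy(y, 1 - y), 0.0)


# After: each expectation gets its own test and docstring
def test_accuracy_of_perfect_prediction() -> None:
    """
    Expects
    -------
    * A prediction identical to the ground truth scores 1.0
    """
    y = np.array([0, 1, 1, 0])
    assert accuracy(y, y) == approx(1.0)


def test_accuracy_of_fully_wrong_prediction() -> None:
    """
    Expects
    -------
    * A prediction that disagrees everywhere scores 0.0
    """
    y = np.array([0, 1, 1, 0])
    assert accuracy(y, 1 - y) == approx(0.0)
```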
Please feel free to make any other changes that make sense in your judgement, as long as the core thing being tested is still being tested! You should also check out `CONTRIBUTING.md`, especially the section on testing, to understand how you can run only the tests you care about. Using `pytest test/test_metric` should be enough, but there's more info in the guide if you need it.
You can continue asking things here or make an initial PR and we can discuss further there at any point!
Thanks for thinking to contribute :)
Best, Eddie
Thanks a lot for your help! I shall open a PR ASAP.
Hello @eddiebergman, it is my first time with open source and I would like to try my hand at `test_evaluation`, if this issue is still open.
Is any help needed on this issue?
Hi, can I work on this issue? Thanks!
Some of our tests follow the `unittest` structure of creating tests. While these are fine, we want to migrate our tests to the more functional style of `pytest`.

For example, `test_classification`, which tests the classification pipeline, adopts the `unittest` style of class-structured tests, in which you must often understand the whole setup to understand one test.

In contrast, a more functional `pytest` test looks like `test_estimators.py::test_fit_performs_dataset_compression`. Here we can define the test entirely through parameters, specify multiple parameter combinations, document which ones we expect to fail, and give a reason for it. Because the test is completely defined through parameters, we can document each parameter and make the test almost completely transparent by looking at just that one test.

We would appreciate any contributors on this side who want to get more familiar or are already familiar with testing. Some familiarity with, or willingness to learn, `pytest`'s `@parametrize` and `@fixture` features is a must. Please see the contribution guide if you'd like to get started! A small sketch of this style follows below.
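Here is a minimal sketch of that functional style; it assumes nothing about auto-sklearn itself, and the fixture, function, and values are invented purely for illustration:

```python
# Illustration of pytest's @parametrize and @fixture, with one case marked
# as an expected failure and a documented reason. Everything here is a
# made-up example, not code from auto-sklearn's test suite.
import pytest


@pytest.fixture
def baseline() -> int:
    """A shared value injected into any test that requests it by name."""
    return 10


@pytest.mark.parametrize(
    "x, y, expected",
    [
        (1, 2, 3),
        (2, 3, 5),
        pytest.param(2, 2, 5, marks=pytest.mark.xfail(reason="2 + 2 != 5")),
    ],
)
def test_addition(x: int, y: int, expected: int) -> None:
    """
    Expects
    -------
    * x + y equals expected for each parametrized case
    """
    assert x + y == expected


def test_baseline_is_positive(baseline: int) -> None:
    """
    Expects
    -------
    * The fixture-provided baseline is strictly positive
    """
    assert baseline > 0
```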