automl / auto-sklearn

Automated Machine Learning with scikit-learn
https://automl.github.io/auto-sklearn
BSD 3-Clause "New" or "Revised" License

Improve some code coverage #1350

Open eddiebergman opened 2 years ago

eddiebergman commented 2 years ago

This is an ongoing issue we need to work on, and we would gladly accept contributions from anyone looking to get some experience with testing open-source code.

Please check out our contribution guide, pytest-coverage, and our code coverage statistics reported from our automatic unit tests if you'd like to get started!

jcob-sikorski commented 2 years ago

Could you elaborate on the parts of the pipeline and the available configurations?

jcob-sikorski commented 2 years ago

Also, which components of the system need code coverage the most?

eddiebergman commented 2 years ago

Hi @jcob-sikorski,

So in our pipeline, we select different components with different hyperparameters and then evaluate them. The base pipelines folder is where most of the heavy lifting is done: the /components folder contains wrappers around existing sklearn components, while /implementations is where we had to implement some custom things.

Each component defines its ConfigurationSpace, which is the space of all possible ways to configure that component. During optimization, this space is queried, and our underlying optimizer, SMAC, slowly optimizes towards the best configuration it can find.
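To make the "each component declares a space of configurations that the optimizer samples from" idea concrete, here is a minimal, self-contained sketch in plain Python. The class and method names are purely illustrative: auto-sklearn itself uses the ConfigSpace library for this, and SMAC for the sampling, whose real APIs differ.

```python
import random

# Hypothetical, minimal stand-in for a ConfigurationSpace: each component
# declares the hyperparameters it exposes and their valid ranges.
# (Illustrative only -- auto-sklearn uses the ConfigSpace library.)
class ToyConfigurationSpace:
    def __init__(self):
        self.hyperparameters = {}

    def add(self, name, low, high):
        self.hyperparameters[name] = (low, high)

    def sample(self, rng=random):
        # The optimizer (SMAC in auto-sklearn) repeatedly queries the
        # space for candidate configurations like this one.
        return {
            name: rng.uniform(low, high)
            for name, (low, high) in self.hyperparameters.items()
        }

# A component such as an SVM wrapper might declare:
space = ToyConfigurationSpace()
space.add("C", 0.01, 10.0)
space.add("gamma", 1e-4, 1.0)

config = space.sample()
print(sorted(config))  # ['C', 'gamma']
```

In the real codebase, each wrapper in /components exposes such a space, and the union over all components is what SMAC searches over.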

Can you access https://app.codecov.io/gh/automl/auto-sklearn/branch/development?

Here you'll see the coverage of the entire codebase (for the development branch). The sunburst diagram is probably the best way to get a quick overview. The most notable areas of the system lacking coverage are /metalearning, with /experimental (autosklearn2) and /ensembles a little below that. You can see this nicely in the file explorer at the bottom of the above page.

You're free to choose wherever you'd like, of course, but since you mentioned the pipeline, I assume you might prefer to look there. In the /pipeline folder, we have 5100 lines, of which 4708 are covered and 392 are not. Not all of it is interesting to test, though; in libsvm-svc, for example, the missed lines are rather hard to test, and covering them probably wouldn't bring much value. However, something like feature_type has many more lines missing and is far more integral to the entire system.
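As a quick sanity check on those /pipeline numbers, the line counts above translate into a coverage percentage like so:

```python
# Figures quoted above for the /pipeline folder:
# 5100 total lines, 4708 covered, 392 missed.
total, covered, missed = 5100, 4708, 392
assert covered + missed == total

coverage = 100 * covered / total
print(f"{coverage:.1f}%")  # roughly 92.3%
```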

Testing these things requires reading around and getting an idea of what's going on; I would be happy to answer any questions you have along the way :) Providing a full overview of how the details of the system work would probably be too much here.

Best, Eddie

jcob-sikorski commented 2 years ago

How can I take a first bite at this? This codebase is huge. Could you suggest some areas where I can start, and where I can find some easy examples of tests?

eddiebergman commented 2 years ago

A simple, very standalone place to start would be to provide some tests for the functional.py file. You could create a new test file called test_functional.py in autosklearn/test/test_util and write some tests there. If you wanted something more related to machine learning, then I would again point to the pipeline parts above; however, that would require knowing much more about what's going on, and some experience with looking through large code bases and trying things out before knowing how to test them.

You can look at test_stopwatch for a simple standalone test file that doesn't require too much outside knowledge.

In any case, your test file might look like:

import pytest

from autosklearn.util.functional import intersection

def test_intersection_empty_iterables():
    """
    Expects
    -------
    * Doing an intersection on an empty list should be empty
    """
    a = []
    assert intersection(a) == set()

def test_intersection_one_list():
    """
    Expects
    -------
    * I actually don't know what it does, find out and see if it makes sense
    """
    a = ["a", "b", "c"]
    expected = ...  # TODO: fill this in once you've read the implementation
    assert intersection(a) == expected

def test_intersection_no_overlap():
    a = {"a", "b", "c"}
    b = {"d"}
    assert intersection(a, b) == set()

def test_intersection_with_overlap():
    a = {"a", "b", "c"}
    b = {"a"}
    assert intersection(a, b) == {"a"}

It would be good to take a look at the functions, try understand what they're doing and then come up with things that would be interesting to test.

There are of course other functions in there and things like roundrobin are a lot more complicated.
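To give a flavour of what testing a roundrobin-style function might look like, here is a sketch that uses the classic itertools round-robin recipe as a stand-in reference implementation. The real autosklearn.util.functional.roundrobin may behave differently, so check the source before reusing these assertions.

```python
from itertools import cycle, islice

# Stand-in reference behaviour for a roundrobin-style function: take one
# item from each iterable in turn until all are exhausted. This is the
# classic itertools recipe, used here only for illustration -- the real
# function in autosklearn.util.functional may differ.
def roundrobin(*iterables):
    num_active = len(iterables)
    nexts = cycle(iter(it).__next__ for it in iterables)
    while num_active:
        try:
            for nxt in nexts:
                yield nxt()
        except StopIteration:
            # One iterable ran out; drop its next() and keep cycling.
            num_active -= 1
            nexts = cycle(islice(nexts, num_active))

def test_roundrobin_interleaves():
    assert list(roundrobin("ABC", "D", "EF")) == ["A", "D", "E", "B", "F", "C"]

def test_roundrobin_no_iterables():
    assert list(roundrobin()) == []
```

The useful habit here is to pin down the expected interleaving on a small concrete input, plus the degenerate empty case, before worrying about anything fancier.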

You can check the line coverage from your tests with, for example:

pytest --cov=autosklearn test/test_util/test_functional.py

I would also check out the CONTRIBUTING.md which walks you through a lot of the setup for installing for dev and making a PR :)

Best, Eddie