openml / automlbenchmark

OpenML AutoML Benchmarking Framework
https://openml.github.io/automlbenchmark
MIT License

Pytest failing #499

Closed: alanwilter closed this issue 1 year ago

alanwilter commented 1 year ago

I ran the following on Ubuntu 20.04 with Python 3.8.10:

git clone https://github.com/openml/automlbenchmark.git
cd automlbenchmark
python3 -m venv venv
source venv/bin/activate
python -m pip install --upgrade pip
python -m pip install -r requirements.txt
python -m pip install -r requirements-dev.txt
python -m pytest -m "not stress"

============================================================================================================== test session starts ==============================================================================================================
platform linux -- Python 3.8.10, pytest-7.1.3, pluggy-1.0.0
rootdir: /home/awilter/automlbenchmark, configfile: pytest.ini, testpaths: tests
plugins: mock-3.8.2
collected 135 items / 2 deselected / 133 selected

tests/unit/amlb/datasets/file/test_file_dataloader.py ....F...                                                                                                                                                                            [  6%]
tests/unit/amlb/datasets/openml/test_openml_dataloader.py ..F                                                                                                                                                                             [  8%]
tests/unit/amlb/datautils/test_encoder.py ............                                                                                                                                                                                    [ 17%]
tests/unit/amlb/frameworks/definitions/test_add_default.py ......................................                                                                                                                                         [ 45%]
tests/unit/amlb/frameworks/definitions/test_framework_definition_processing.py .............................                                                                                                                              [ 67%]
tests/unit/amlb/frameworks/definitions/test_load_and_merge_framework_definitions.py ....                                                                                                                                                  [ 70%]
tests/unit/amlb/frameworks/definitions/test_load_framework_definitions.py ..                                                                                                                                                              [ 72%]
tests/unit/amlb/job/test_MultiThreadingJobRunner.py .....                                                                                                                                                                                 [ 75%]
tests/unit/amlb/job/test_SimpleJobRunner.py ...                                                                                                                                                                                           [ 78%]
tests/unit/amlb/resources/test_framework_definition.py ......                                                                                                                                                                             [ 82%]
tests/unit/amlb/utils/process/test_InterruptTimeout.py ........                                                                                                                                                                           [ 88%]
tests/unit/amlb/utils/serialization/test_serializers.py ...............                                                                                                                                                                   [100%]

=================================================================================================================== FAILURES ====================================================================================================================
_________________________________________________________________________________________________ test_load_multiclass_task_with_num_target_csv _________________________________________________________________________________________________

file_loader = FileLoader({})

    @pytest.mark.use_disk
    def test_load_multiclass_task_with_num_target_csv(file_loader):
        ds_def = ns(
            train=os.path.join(res, "iris_num_train.csv"),
            test=os.path.join(res, "iris_num_test.csv"),
            target="class",
            type="multiclass"
        )
        ds = file_loader.load(ds_def)
        assert ds.type is DatasetType.multiclass
        _assert_X_y_types(ds.train)
        _assert_data_consistency(ds)
>       _assert_data_paths(ds, ds_def)

tests/unit/amlb/datasets/file/test_file_dataloader.py:125:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
tests/unit/amlb/datasets/file/test_file_dataloader.py:241: in _assert_data_paths
    assert dataset.train.data_path(f) == path_from_split(s)
amlb/datasets/file.py:188: in data_path
    return self._get_data(format)
amlb/datasets/file.py:207: in _get_data
    self._data[fmt] = converter.convert(self)
amlb/datasets/file.py:399: in convert
    self._write_file(split.data, target_path)
amlb/datasets/file.py:422: in _write_file
    arff.dump(dict(
venv/lib/python3.8/site-packages/arff.py:1091: in dump
    for row in generator:
venv/lib/python3.8/site-packages/arff.py:1028: in iter_encode
    yield self._encode_attribute(attr[0], attr[1])
venv/lib/python3.8/site-packages/arff.py:964: in _encode_attribute
    type_tmp = [u'%s' % encode_string(type_k) for type_k in type_]
venv/lib/python3.8/site-packages/arff.py:964: in <listcomp>
    type_tmp = [u'%s' % encode_string(type_k) for type_k in type_]
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

s = 1

    def encode_string(s):
>       if _RE_QUOTE_CHARS.search(s):
E       TypeError: cannot use a string pattern on a bytes-like object

venv/lib/python3.8/site-packages/arff.py:420: TypeError
___________________________________________________________________________________________________________ test_load_regression_task ___________________________________________________________________________________________________________

oml_loader = <amlb.datasets.openml.OpenmlLoader object at 0x7f5c9e8a3dc0>

    @pytest.mark.use_disk
    @pytest.mark.use_web
    def test_load_regression_task(oml_loader):
        fold = random.randint(0, 9)
        ds = oml_loader.load(task_id=2295, fold=fold)  # cholesterol
        assert ds.type is DatasetType.regression
        _assert_X_y_types(ds.train)
        _assert_data_consistency(ds)
        _assert_data_paths(ds, ds._oml_dataset.dataset_id, fold)
>       _assert_cholesterol_features(ds)

tests/unit/amlb/datasets/openml/test_openml_dataloader.py:99:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

dataset = <amlb.datasets.openml.OpenmlDataset object at 0x7f5ca1add610>

    def _assert_cholesterol_features(dataset):
        assert len(dataset.features) == 14
        assert len(dataset.predictors) == 13

        _assert_target(dataset.target, "chol")

        numericals = [p.name for p in dataset.predictors if p.data_type == 'number']
        categoricals = [p.name for p in dataset.predictors if p.data_type == 'category']
        assert len(numericals) == 6
        assert len(categoricals) == 7
        assert len([p for p in dataset.predictors if p.has_missing_values]) == 2

        assert dataset.train.X.dtypes.filter(items=numericals).apply(lambda dt: pd.api.types.is_numeric_dtype(dt)).all()
        assert dataset.train.X.dtypes.filter(items=categoricals).apply(lambda dt: pd.api.types.is_categorical_dtype(dt)).all()
>       assert len(dataset.train.X.select_dtypes(include=['float']).columns) == 2
E       AssertionError: assert 6 == 2
E        +  where 6 = len(Index(['age', 'trestbps', 'thalach', 'oldpeak', 'ca', 'num'], dtype='object'))
E        +    where Index(['age', 'trestbps', 'thalach', 'oldpeak', 'ca', 'num'], dtype='object') =       age  trestbps  thalach  oldpeak   ca  num\n165  57.0     132.0    168.0      0.0  0.0  0.0\n111  56.0     125.0   ...29   40.0     110.0    114.0      2.0  0.0  3.0\n287  58.0     125.0    144.0      0.4  NaN  0.0\n\n[273 rows x 6 columns].columns
E        +      where       age  trestbps  thalach  oldpeak   ca  num\n165  57.0     132.0    168.0      0.0  0.0  0.0\n111  56.0     125.0   ...29   40.0     110.0    114.0      2.0  0.0  3.0\n287  58.0     125.0    144.0      0.4  NaN  0.0\n\n[273 rows x 6 columns] = <bound method DataFrame.select_dtypes of       age sex cp  trestbps fbs restecg  ...  exang oldpeak  slope   ca  thal ...     7  3.0\n287  58.0   1  2     125.0   0       0  ...      0     0.4      2  NaN     7  0.0\n\n[273 rows x 13 columns]>(include=['float'])
E        +        where <bound method DataFrame.select_dtypes of       age sex cp  trestbps fbs restecg  ...  exang oldpeak  slope   ca  thal ...     7  3.0\n287  58.0   1  2     125.0   0       0  ...      0     0.4      2  NaN     7  0.0\n\n[273 rows x 13 columns]> =       age sex cp  trestbps fbs restecg  ...  exang oldpeak  slope   ca  thal  num\n165  57.0   1  4     132.0   0      ...0     7  3.0\n287  58.0   1  2     125.0   0       0  ...      0     0.4      2  NaN     7  0.0\n\n[273 rows x 13 columns].select_dtypes
E        +          where       age sex cp  trestbps fbs restecg  ...  exang oldpeak  slope   ca  thal  num\n165  57.0   1  4     132.0   0      ...0     7  3.0\n287  58.0   1  2     125.0   0       0  ...      0     0.4      2  NaN     7  0.0\n\n[273 rows x 13 columns] = <amlb.datasets.openml.OpenmlDatasplit object at 0x7f5ca1add520>.X
E        +            where <amlb.datasets.openml.OpenmlDatasplit object at 0x7f5ca1add520> = <amlb.datasets.openml.OpenmlDataset object at 0x7f5ca1add610>.train

tests/unit/amlb/datasets/openml/test_openml_dataloader.py:116: AssertionError
============================================================================================================ short test summary info ============================================================================================================
FAILED tests/unit/amlb/datasets/file/test_file_dataloader.py::test_load_multiclass_task_with_num_target_csv - TypeError: cannot use a string pattern on a bytes-like object
FAILED tests/unit/amlb/datasets/openml/test_openml_dataloader.py::test_load_regression_task - AssertionError: assert 6 == 2
================================================================================================= 2 failed, 131 passed, 2 deselected in 31.20s ==================================================================================================
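
The first traceback ends in a regex search over a value that is not a `str` (`s = 1`). The sketch below reproduces that failure mode with the standard library only. The pattern `QUOTE_CHARS` and the helper `encode_nominal` are hypothetical stand-ins, not the library's actual code, and the example assumes the numeric class label reaches the encoder as a bytes-like object (possibly a NumPy scalar, which is a guess). Converting nominal values to `str` before encoding is one possible workaround, not necessarily the fix applied in the repository.

```python
import re

# Hypothetical stand-in for the encoder's quote-character pattern (assumption).
QUOTE_CHARS = re.compile(r"""[\s"'%,]""")


def encode_string(s):
    """Quote a value if it contains characters that need escaping."""
    if QUOTE_CHARS.search(s):
        return "'%s'" % s
    return s


def encode_nominal(values):
    """Illustrative workaround: convert each nominal value to str first."""
    return [encode_string(str(v)) for v in values]


# A str pattern applied to a bytes-like value raises the TypeError
# seen in the traceback.
error_message = None
try:
    encode_string(b"1")
except TypeError as exc:
    error_message = str(exc)

# Numeric class labels encode cleanly once they are strings.
encoded = encode_nominal([1, 2, 3])
print(error_message)
print(encoded)
```

If this reading is right, a numeric multiclass target would need its categories cast to strings before the ARFF file is written.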

If I run the tests again with python -m pytest -m "not stress", only the latter fails:

FAILED tests/unit/amlb/datasets/openml/test_openml_dataloader.py::test_load_regression_task - AssertionError: assert 6 == 2
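
The persistent failure comes from `select_dtypes(include=['float'])` returning six columns where the test expects two. In the output, `age`, `trestbps`, `thalach` and `num` hold whole numbers stored as floats. The sketch below uses a hypothetical three-column miniature of that frame to show how such columns inflate the float count. The helper names are illustrative, and the assumption that integer-valued columns were loaded as floats is a guess at the cause, not a confirmed diagnosis.

```python
import pandas as pd

# Hypothetical miniature of the cholesterol predictors (assumption):
# "age" is integer-valued but stored as float, "ca" has a missing value.
X = pd.DataFrame({
    "age": [57.0, 56.0, 40.0],
    "oldpeak": [0.0, 2.0, 0.4],
    "ca": [0.0, None, 0.0],
})


def float_columns(df):
    """Names of the columns that select_dtypes treats as float."""
    return list(df.select_dtypes(include=["float"]).columns)


def integer_valued_floats(df):
    """Float columns with no missing values and only whole numbers."""
    names = []
    for name in float_columns(df):
        col = df[name]
        if col.notna().all() and (col % 1 == 0).all():
            names.append(name)
    return names


before = float_columns(X)
candidates = integer_valued_floats(X)

# Casting the whole-number columns to int64 leaves only genuine floats.
X_fixed = X.astype({name: "int64" for name in candidates})
after = float_columns(X_fixed)
print(before, candidates, after)
```

Here `ca` stays float because a missing value cannot be held in a plain `int64` column, which matches a test expectation of two float columns if those are `oldpeak` and `ca`.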

PGijsbers commented 1 year ago

All tests pass on master now.