mljar / mljar-supervised

Python package for AutoML on Tabular Data with Feature Engineering, Hyper-Parameters Tuning, Explanations and Automatic Documentation
https://mljar.com
MIT License

user warning in test: tests/tests_validation/test_validator_kfold.py::KFoldValidatorTest::test_disable_repeats_when_disabled_shuffle #761

Closed by a-szulc 2 months ago

a-szulc commented 2 months ago
============================= test session starts ==============================
platform linux -- Python 3.12.3, pytest-8.3.2, pluggy-1.5.0 -- /home/adas/mljar/mljar-supervised/venv/bin/python3
cachedir: .pytest_cache
rootdir: /home/adas/mljar/mljar-supervised
configfile: pytest.ini
plugins: cov-5.0.0
collecting ... collected 1 item

tests/tests_validation/test_validator_kfold.py::KFoldValidatorTest::test_disable_repeats_when_disabled_shuffle FAILED

=================================== FAILURES ===================================
________ KFoldValidatorTest.test_disable_repeats_when_disabled_shuffle _________

self = <tests.tests_validation.test_validator_kfold.KFoldValidatorTest testMethod=test_disable_repeats_when_disabled_shuffle>

    def test_disable_repeats_when_disabled_shuffle(self):
        with tempfile.TemporaryDirectory() as results_path:
            data = {
                "X": pd.DataFrame(
                    np.array([[0, 0], [0, 1], [1, 0], [1, 1]]), columns=["a", "b"]
                ),
                "y": pd.DataFrame(np.array([0, 0, 1, 1]), columns=["target"]),
            }

            X_path = os.path.join(results_path, "X.data")
            y_path = os.path.join(results_path, "y.data")

            dump_data(X_path, data["X"])
            dump_data(y_path, data["y"])

            params = {
                "shuffle": False,
                "stratify": False,
                "k_folds": 2,
                "repeats": 10,
                "results_path": results_path,
                "X_path": X_path,
                "y_path": y_path,
                "random_seed": 1,
            }
>           vl = KFoldValidator(params)

tests/tests_validation/test_validator_kfold.py:199: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <supervised.validation.validator_kfold.KFoldValidator object at 0x7bab0b929e20>
params = {'X_path': '/tmp/tmpsf5lyvar/X.data', 'k_folds': 2, 'random_seed': 1, 'repeats': 10, ...}

    def __init__(self, params):
        BaseValidator.__init__(self, params)

        self.k_folds = self.params.get("k_folds", 5)
        self.shuffle = self.params.get("shuffle", True)
        self.stratify = self.params.get("stratify", False)
        self.random_seed = self.params.get("random_seed", 1906)
        self.repeats = self.params.get("repeats", 1)

        if not self.shuffle and self.repeats > 1:
>           warnings.warn("Disable repeats in validation because shuffle is disabled")
E           UserWarning: Disable repeats in validation because shuffle is disabled

supervised/validation/validator_kfold.py:28: UserWarning
=========================== short test summary info ============================
FAILED tests/tests_validation/test_validator_kfold.py::KFoldValidatorTest::test_disable_repeats_when_disabled_shuffle
============================== 1 failed in 1.92s ===============================
a-szulc commented 2 months ago

Fixed in #768.
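
For readers hitting the same failure: the traceback shows the `UserWarning` itself being raised as the test error, which suggests warnings are escalated to errors in this test run (for example through a `filterwarnings = error` setting; that is an assumption, since `pytest.ini` is not shown). How #768 actually resolved it is not shown in this thread. A minimal sketch of one common approach is to have the test capture the expected warning explicitly. `KFoldValidatorSketch` below is a hypothetical stand-in that mirrors only the `__init__` logic visible in the traceback, and resetting `repeats` to 1 is an assumption inferred from the test name.

```python
import warnings


class KFoldValidatorSketch:
    """Hypothetical stand-in; mirrors only the __init__ logic shown in the traceback."""

    def __init__(self, params):
        self.shuffle = params.get("shuffle", True)
        self.repeats = params.get("repeats", 1)
        if not self.shuffle and self.repeats > 1:
            warnings.warn("Disable repeats in validation because shuffle is disabled")
            # Assumption: repeats is reset here; the traceback does not show this line.
            self.repeats = 1


def build_with_expected_warning(params):
    """Build the validator while recording warnings instead of letting them propagate."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # record every warning, overriding any "error" filter
        validator = KFoldValidatorSketch(params)
    return validator, caught


def build_with_warnings_as_errors(params):
    """Reproduce the failure mode: warnings are turned into exceptions."""
    with warnings.catch_warnings():
        warnings.simplefilter("error")
        return KFoldValidatorSketch(params)
```

In a `unittest.TestCase`, the same intent can be expressed with `with self.assertWarns(UserWarning):` around the constructor call, which both silences the escalation and asserts that the warning is emitted.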