kedro-org / kedro-starters

Templates for your Kedro projects.
Apache License 2.0

`spaceflights-pandas` tests do not run successfully #221

Open jglev opened 2 months ago

Description

After creating a project with kedro new --starter=spaceflights-pandas, running python -m pytest fails with two successive layers of errors on several versions of Python.

Steps to Reproduce

(This occurs across Python 3.10, 3.11, and 3.12)

  1. Run kedro new --starter=spaceflights-pandas
  2. Run cd spaceflights-pandas
  3. Run pip install -r requirements.txt
  4. Run python -m pytest

Expected Result

All tests that come with the starter should pass without error.

Actual Result

There are two levels of errors:

Level 1: An import error occurs because the tests directory sits at the project root

The Kedro documentation's Automated Testing page instructs users to run pip install -e .; however, the starter's README makes no mention of this. Thus, upon seeing the tests directory and running python -m pytest without installing the project first, users see this error message:

$ python -m pytest
==================================================================================== test session starts ====================================================================================
platform darwin -- Python 3.11.5, pytest-7.4.4, pluggy-1.3.0
rootdir: /Users/MyUserName/Downloads/spaceflights-pandas
configfile: pyproject.toml
plugins: mock-1.13.0, anyio-3.7.1, cov-3.0.0
collected 1 item / 1 error
/Users/MyUserName/Downloads/spaceflights-pandas/venv/lib/python3.11/site-packages/coverage/control.py:888: CoverageWarning: No data was collected. (no-data-collected)
  self._warn("No data was collected.", slug="no-data-collected")

========================================================================================== ERRORS ===========================================================================================
______________________________________________________________ ERROR collecting tests/pipelines/data_science/test_pipeline.py _______________________________________________________________
ImportError while importing test module '/Users/MyUserName/Downloads/spaceflights-pandas/tests/pipelines/data_science/test_pipeline.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
/opt/homebrew/Cellar/python@3.11/3.11.5/Frameworks/Python.framework/Versions/3.11/lib/python3.11/importlib/__init__.py:126: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
tests/pipelines/data_science/test_pipeline.py:6: in <module>
    from spaceflights_pandas.pipelines.data_science import create_pipeline as create_ds_pipeline
E   ModuleNotFoundError: No module named 'spaceflights_pandas'
===================================================================================== warnings summary ======================================================================================
venv/lib/python3.11/site-packages/pytest_cov/plugin.py:256
  /Users/MyUserName/Downloads/spaceflights-pandas/venv/lib/python3.11/site-packages/pytest_cov/plugin.py:256: PytestDeprecationWarning: The hookimpl CovPlugin.pytest_configure_node uses old-style configuration options (marks or attributes).
  Please use the pytest.hookimpl(optionalhook=True) decorator instead
   to configure the hooks.
   See https://docs.pytest.org/en/latest/deprecations.html#configuring-hook-specs-impls-using-markers
    def pytest_configure_node(self, node):

venv/lib/python3.11/site-packages/pytest_cov/plugin.py:265
  /Users/MyUserName/Downloads/spaceflights-pandas/venv/lib/python3.11/site-packages/pytest_cov/plugin.py:265: PytestDeprecationWarning: The hookimpl CovPlugin.pytest_testnodedown uses old-style configuration options (marks or attributes).
  Please use the pytest.hookimpl(optionalhook=True) decorator instead
   to configure the hooks.
   See https://docs.pytest.org/en/latest/deprecations.html#configuring-hook-specs-impls-using-markers
    def pytest_testnodedown(self, node, error):

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html

---------- coverage: platform darwin, python 3.11.5-final-0 ----------
Name                                                            Stmts   Miss  Cover   Missing
---------------------------------------------------------------------------------------------
src/spaceflights_pandas/__init__.py                                 1      1     0%   4
src/spaceflights_pandas/__main__.py                                30     30     0%   4-47
src/spaceflights_pandas/pipeline_registry.py                        7      7     0%   2-16
src/spaceflights_pandas/pipelines/__init__.py                       0      0   100%
src/spaceflights_pandas/pipelines/data_processing/__init__.py       1      1     0%   3
src/spaceflights_pandas/pipelines/data_processing/nodes.py         26     26     0%   1-68
src/spaceflights_pandas/pipelines/data_processing/pipeline.py       4      4     0%   1-7
src/spaceflights_pandas/pipelines/data_science/__init__.py          1      1     0%   3
src/spaceflights_pandas/pipelines/data_science/nodes.py            20     20     0%   1-55
src/spaceflights_pandas/pipelines/data_science/pipeline.py          4      4     0%   1-7
src/spaceflights_pandas/settings.py                                 3      3     0%   27-31
---------------------------------------------------------------------------------------------
TOTAL                                                              97     97     0%

================================================================================== short test summary info ==================================================================================
ERROR tests/pipelines/data_science/test_pipeline.py
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! Interrupted: 1 error during collection !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
=============================================================================== 2 warnings, 1 error in 11.80s ===============================================================================
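The ModuleNotFoundError above is consistent with how a src layout behaves: the package lives under src/, and that directory is not on sys.path until the project is installed (for example with pip install -e .). The following is a minimal, self-contained sketch of that mechanism. It builds a throwaway src layout in a temporary directory; the package name demo_src_layout_pkg is hypothetical and has nothing to do with the starter itself.

```python
import importlib
import importlib.util
import sys
import tempfile
from pathlib import Path

PKG = "demo_src_layout_pkg"  # hypothetical package name, for illustration only


def is_importable(name):
    """Return True if the import system can locate a top-level module."""
    importlib.invalidate_caches()
    return importlib.util.find_spec(name) is not None


def demo():
    """Check importability before and after src/ is placed on sys.path."""
    with tempfile.TemporaryDirectory() as root:
        src = Path(root) / "src"
        pkg = src / PKG
        pkg.mkdir(parents=True)
        (pkg / "__init__.py").write_text("")

        # Only the project root would normally be visible, not src/.
        before = is_importable(PKG)

        # Installing the project has a similar effect to exposing src/.
        sys.path.insert(0, str(src))
        try:
            after = is_importable(PKG)
        finally:
            sys.path.remove(str(src))
    return before, after


if __name__ == "__main__":
    print(demo())
```

The package is not found until src/ is visible to the import system, which matches the observation that the collection error goes away after pip install -e ..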

Level 2: KedroContext throws an error

After either running pip install -e . or moving the tests directory into src, running python -m pytest again produces a second error:

$ python -m pytest
==================================================================================== test session starts ====================================================================================
platform darwin -- Python 3.11.5, pytest-7.4.4, pluggy-1.3.0
rootdir: /Users/MyUserName/Downloads/spaceflights-pandas
configfile: pyproject.toml
plugins: mock-1.13.0, anyio-3.7.1, cov-3.0.0
collected 4 items

tests/test_run.py E                                                                                                                                                                   [ 25%]
tests/pipelines/data_science/test_pipeline.py ...                                                                                                                                     [100%]

========================================================================================== ERRORS ===========================================================================================
__________________________________________________________________ ERROR at setup of TestProjectContext.test_project_path ___________________________________________________________________

config_loader = OmegaConfigLoader(conf_source=/Users/MyUserName/Downloads/spaceflights-pandas, env=None, config_patterns={'catalog': ['ca... '**/parameters*'], 'credentials': ['credentials*', 'credentials*/**', '**/credentials*'], 'globals': ['globals.yml']})

    @pytest.fixture
    def project_context(config_loader):
>       return KedroContext(
            package_name="spaceflights_pandas",
            project_path=Path.cwd(),
            config_loader=config_loader,
            hook_manager=_create_hook_manager(),
        )
E       TypeError: KedroContext.__init__() missing 1 required positional argument: 'env'

tests/test_run.py:23: TypeError
===================================================================================== warnings summary ======================================================================================
venv/lib/python3.11/site-packages/pytest_cov/plugin.py:256
  /Users/MyUserName/Downloads/spaceflights-pandas/venv/lib/python3.11/site-packages/pytest_cov/plugin.py:256: PytestDeprecationWarning: The hookimpl CovPlugin.pytest_configure_node uses old-style configuration options (marks or attributes).
  Please use the pytest.hookimpl(optionalhook=True) decorator instead
   to configure the hooks.
   See https://docs.pytest.org/en/latest/deprecations.html#configuring-hook-specs-impls-using-markers
    def pytest_configure_node(self, node):

venv/lib/python3.11/site-packages/pytest_cov/plugin.py:265
  /Users/MyUserName/Downloads/spaceflights-pandas/venv/lib/python3.11/site-packages/pytest_cov/plugin.py:265: PytestDeprecationWarning: The hookimpl CovPlugin.pytest_testnodedown uses old-style configuration options (marks or attributes).
  Please use the pytest.hookimpl(optionalhook=True) decorator instead
   to configure the hooks.
   See https://docs.pytest.org/en/latest/deprecations.html#configuring-hook-specs-impls-using-markers
    def pytest_testnodedown(self, node, error):

tests/pipelines/data_science/test_pipeline.py::test_data_science_pipeline
  /Users/MyUserName/Downloads/spaceflights-pandas/venv/lib/python3.11/site-packages/sklearn/metrics/_regression.py:1187: UndefinedMetricWarning: R^2 score is not well-defined with less than two samples.
    warnings.warn(msg, UndefinedMetricWarning)

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html

---------- coverage: platform darwin, python 3.11.5-final-0 ----------
Name                                                            Stmts   Miss  Cover   Missing
---------------------------------------------------------------------------------------------
src/spaceflights_pandas/__init__.py                                 1      0   100%
src/spaceflights_pandas/__main__.py                                30     30     0%   4-47
src/spaceflights_pandas/pipeline_registry.py                        7      7     0%   2-16
src/spaceflights_pandas/pipelines/__init__.py                       0      0   100%
src/spaceflights_pandas/pipelines/data_processing/__init__.py       1      1     0%   3
src/spaceflights_pandas/pipelines/data_processing/nodes.py         26     26     0%   1-68
src/spaceflights_pandas/pipelines/data_processing/pipeline.py       4      4     0%   1-7
src/spaceflights_pandas/pipelines/data_science/__init__.py          1      0   100%
src/spaceflights_pandas/pipelines/data_science/nodes.py            20      0   100%
src/spaceflights_pandas/pipelines/data_science/pipeline.py          4      0   100%
src/spaceflights_pandas/settings.py                                 3      3     0%   27-31
---------------------------------------------------------------------------------------------
TOTAL                                                              97     71    27%

================================================================================== short test summary info ==================================================================================
ERROR tests/test_run.py::TestProjectContext::test_project_path - TypeError: KedroContext.__init__() missing 1 required positional argument: 'env'
========================================================================== 3 passed, 3 warnings, 1 error in 14.05s ==========================================================================

The KedroContext documentation states that env should default to "local", but that default does not seem to be picked up here. Manually adding env="local" to the KedroContext call in the project_context fixture resolves this error:

$ python -m pytest
==================================================================================== test session starts ====================================================================================
platform darwin -- Python 3.11.5, pytest-7.4.4, pluggy-1.3.0
rootdir: /Users/MyUserName/Downloads/spaceflights-pandas/spaceflights-pandas
configfile: pyproject.toml
plugins: mock-1.13.0, anyio-3.7.1, cov-3.0.0
collected 4 items

tests/test_run.py .                                                                                                                                                                   [ 25%]
tests/pipelines/data_science/test_pipeline.py ...                                                                                                                                     [100%]

===================================================================================== warnings summary ======================================================================================
../venv/lib/python3.11/site-packages/pytest_cov/plugin.py:256
  /Users/MyUserName/Downloads/spaceflights-pandas/venv/lib/python3.11/site-packages/pytest_cov/plugin.py:256: PytestDeprecationWarning: The hookimpl CovPlugin.pytest_configure_node uses old-style configuration options (marks or attributes).
  Please use the pytest.hookimpl(optionalhook=True) decorator instead
   to configure the hooks.
   See https://docs.pytest.org/en/latest/deprecations.html#configuring-hook-specs-impls-using-markers
    def pytest_configure_node(self, node):

../venv/lib/python3.11/site-packages/pytest_cov/plugin.py:265
  /Users/MyUserName/Downloads/spaceflights-pandas/venv/lib/python3.11/site-packages/pytest_cov/plugin.py:265: PytestDeprecationWarning: The hookimpl CovPlugin.pytest_testnodedown uses old-style configuration options (marks or attributes).
  Please use the pytest.hookimpl(optionalhook=True) decorator instead
   to configure the hooks.
   See https://docs.pytest.org/en/latest/deprecations.html#configuring-hook-specs-impls-using-markers
    def pytest_testnodedown(self, node, error):

tests/pipelines/data_science/test_pipeline.py::test_data_science_pipeline
  /Users/MyUserName/Downloads/spaceflights-pandas/venv/lib/python3.11/site-packages/sklearn/metrics/_regression.py:1187: UndefinedMetricWarning: R^2 score is not well-defined with less than two samples.
    warnings.warn(msg, UndefinedMetricWarning)

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html

---------- coverage: platform darwin, python 3.11.5-final-0 ----------
Name                                                            Stmts   Miss  Cover   Missing
---------------------------------------------------------------------------------------------
src/spaceflights_pandas/__init__.py                                 1      0   100%
src/spaceflights_pandas/__main__.py                                30     30     0%   4-47
src/spaceflights_pandas/pipeline_registry.py                        7      7     0%   2-16
src/spaceflights_pandas/pipelines/__init__.py                       0      0   100%
src/spaceflights_pandas/pipelines/data_processing/__init__.py       1      1     0%   3
src/spaceflights_pandas/pipelines/data_processing/nodes.py         26     26     0%   1-68
src/spaceflights_pandas/pipelines/data_processing/pipeline.py       4      4     0%   1-7
src/spaceflights_pandas/pipelines/data_science/__init__.py          1      0   100%
src/spaceflights_pandas/pipelines/data_science/nodes.py            20      0   100%
src/spaceflights_pandas/pipelines/data_science/pipeline.py          4      0   100%
src/spaceflights_pandas/settings.py                                 3      3     0%   27-31
---------------------------------------------------------------------------------------------
TOTAL                                                              97     71    27%

=============================================================================== 4 passed, 3 warnings in 2.13s ===============================================================================
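The TypeError and its fix can be illustrated without Kedro installed. The sketch below uses a plain dataclass named StandInContext, which is a hypothetical stand-in and not Kedro's actual KedroContext. It only assumes what the traceback shows: that env is a constructor argument with no default, so a call that omits it fails and a call that passes env="local" succeeds.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Any, Optional


@dataclass
class StandInContext:
    """Hypothetical stand-in for illustration; NOT Kedro's KedroContext."""

    project_path: Path
    config_loader: Any
    env: Optional[str]  # no default, so every caller must pass it
    package_name: str
    hook_manager: Any


def make_context(**overrides):
    """Mirror the keyword arguments used in the failing fixture call."""
    kwargs = dict(
        package_name="spaceflights_pandas",
        project_path=Path.cwd(),
        config_loader=object(),  # placeholder for the real config loader
        hook_manager=object(),   # placeholder for the real hook manager
    )
    kwargs.update(overrides)
    return StandInContext(**kwargs)


if __name__ == "__main__":
    try:
        make_context()  # same shape as the fixture in tests/test_run.py
    except TypeError as exc:
        print("fails:", exc)

    context = make_context(env="local")  # the workaround described above
    print("works:", context.env)
```

Whether the right long-term fix is to pass env in the starter's test or to give env a default in Kedro itself is the open question this issue raises; the sketch only shows why the current call fails.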

Recommendations

  1. Add a note to the README telling users to run pip install -e ., or else move tests into src.
  2. Add env="local" to KedroContext in the example test file above.
  3. Optionally, under pyproject.toml's [tool.pytest.ini_options] section, add filterwarnings = ["ignore::DeprecationWarning:.*pytest_cov*"] to suppress the pytest-cov warnings above.
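The filter string in the third recommendation follows the action:message:category:module pattern of Python's warnings filters. As a rough illustration of how such a filter selects warnings by category and originating module, here is a sketch using only the standard warnings module. StandInDeprecationWarning and the module name some_other_plugin are hypothetical; the sketch does not involve pytest or pytest-cov, and assumes the real warning class is a DeprecationWarning subclass, as its name suggests.

```python
import warnings


class StandInDeprecationWarning(DeprecationWarning):
    """Hypothetical stand-in for a plugin-specific deprecation warning."""


def emit(module_name):
    """Raise a warning attributed to the given module name."""
    warnings.warn_explicit(
        "old-style hook configuration",
        StandInDeprecationWarning,
        filename=module_name.replace(".", "/") + ".py",
        lineno=1,
        module=module_name,
    )


def recorded_files():
    """Return the filenames of warnings that get past the ignore filter."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        # Ignore DeprecationWarning (and subclasses) whose module name
        # matches the regular expression, similar to the proposed filter.
        warnings.filterwarnings(
            "ignore", category=DeprecationWarning, module=r".*pytest_cov"
        )
        emit("pytest_cov.plugin")
        emit("some_other_plugin")
    return [w.filename for w in caught]


if __name__ == "__main__":
    print(recorded_files())
```

Only the warning attributed to a module matching the pattern is dropped; deprecation warnings from other modules remain visible, which is the point of scoping the filter by module rather than ignoring the category outright.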

I would be happy to contribute a PR implementing the above, but thought to ask first: would those changes be welcome? Or, specifically with the KedroContext error above, is it possible that part of this points to something that needs to be updated in kedro itself?