Description
When I run a packaged Kedro project, the logs get the project name from the directory, which is very weird:
$ cd /tmp
$ python -m test_package_kedro --conf-source ~/.../dist/conf-test_package_kedro.tar.gz
...
[05/30/24 14:16:49] INFO Kedro project tmp session.py:324
[05/30/24 14:16:50] INFO Using synchronous mode for sequential_runner.py:64
loading and saving data.
...
This is more confusing if I'm in a directory that has nothing to do with the project:
$ cd ~/Projects/seaborn
$ python -m test_package_kedro --conf-source ~/.../dist/conf-test_package_kedro.tar.gz
...
[05/30/24 14:14:36] INFO Kedro project seaborn session.py:324
...
Context
The reason I'm reporting this is actually twofold:
(Relatively minor) If I'm running a packaged project, I'd expect the logs to draw the project name from the project itself (for example, using the package name or the distribution name).
Is Kedro reading pyproject.toml from disk when running a packaged project? Because that shouldn't happen as far as I understand. I observed this while reviewing https://github.com/kedro-org/kedro-plugins/pull/701/, specifically https://github.com/kedro-org/kedro-plugins/pull/701/files#r1618811427
Steps to Reproduce
Run a packaged Kedro project with python -m <kedro_project_package_name> (equivalent of kedro run for packaged projects)
Expected Result
Are these expectations reasonable? Is there something I'm missing?
Your Environment
Kedro version used (pip show kedro or kedro -V):
Python version used (python -V):