kedro-org / kedro

Kedro is a toolbox for production-ready data science. It uses software engineering best practices to help you create data engineering and data science pipelines that are reproducible, maintainable, and modular.
https://kedro.org
Apache License 2.0

Improve error message when executing `kedro run` without pipeline #3794

Open davidmosca opened 5 months ago

davidmosca commented 5 months ago

Description

After having installed Kedro as per the instructions (https://docs.kedro.org/en/stable/get_started/new_project.html#run-the-new-project), I get an error message when I execute kedro run.

Context

This bug prevents me from completing the installation of Kedro.

Steps to Reproduce

1. Activate the Kedro environment: `conda activate myenv`
2. Execute Kedro: `kedro run`

Expected Result

The execution should complete without errors.

Actual Result

The run fails and the stack trace shows an error message.

(myenv) C:\Users\user1\Documents\GitHub\myproject>kedro run
[04/09/24 15:02:13] INFO     Kedro project myproject                                                      session.py:321
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "C:\Users\user1\AppData\Local\anaconda3\envs\myenv\Scripts\kedro.exe\__main__.py", line 7, in <module>
  File "C:\Users\user1\AppData\Local\anaconda3\envs\myenv\Lib\site-packages\kedro\framework\cli\cli.py", line 198, in main
    cli_collection()
  File "C:\Users\user1\AppData\Local\anaconda3\envs\myenv\Lib\site-packages\click\core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\user1\AppData\Local\anaconda3\envs\myenv\Lib\site-packages\kedro\framework\cli\cli.py", line 127, in main
    super().main(
  File "C:\Users\user1\AppData\Local\anaconda3\envs\myenv\Lib\site-packages\click\core.py", line 1078, in main
    rv = self.invoke(ctx)
         ^^^^^^^^^^^^^^^^
  File "C:\Users\user1\AppData\Local\anaconda3\envs\myenv\Lib\site-packages\click\core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\user1\AppData\Local\anaconda3\envs\myenv\Lib\site-packages\click\core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\user1\AppData\Local\anaconda3\envs\myenv\Lib\site-packages\click\core.py", line 783, in invoke
    return __callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\user1\AppData\Local\anaconda3\envs\myenv\Lib\site-packages\kedro\framework\cli\project.py", line 225, in run
    session.run(
  File "C:\Users\user1\AppData\Local\anaconda3\envs\myenv\Lib\site-packages\kedro\framework\session\session.py", line 346, in run
    filtered_pipeline = pipeline.filter(
                        ^^^^^^^^^^^^^^^^
  File "C:\Users\user1\AppData\Local\anaconda3\envs\myenv\Lib\site-packages\kedro\pipeline\pipeline.py", line 768, in filter
    raise ValueError(
ValueError: Pipeline contains no nodes after applying all provided filters

Your Environment

noklam commented 5 months ago

You need to install the project dependencies before you can run a project (these could be pandas, pyspark, etc., depending on what you selected).

The steps should be

kedro new
cd <project_name>
pip install -r requirements.txt or pip install -e .
kedro run

If this solves the issue, please close the ticket.

noklam commented 5 months ago

I see that the docs are confusing. You need to have a pipeline in order to run kedro run.

kedro new --name=testproject --tools=lint,docs,pyspark --example=n means you are not selecting any example, so you have no pipeline or nodes, hence the error. Can you change it to --example=y instead?
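
To illustrate why an empty project ends in this particular error, here is a minimal sketch in plain Python. It is a toy stand-in, not Kedro's implementation: `filter_nodes` and the dict-based node representation are hypothetical, and only the error text is taken from the traceback above. The point is that when there are no nodes to begin with, the filter step fails even though no filters were passed.

```python
def filter_nodes(nodes, tags=None):
    """Toy stand-in for a pipeline filter step; NOT Kedro's actual code.

    Each node is modelled as a dict with a "tags" set (an assumption
    made purely for this illustration).
    """
    wanted = set(tags) if tags else None
    # Keep every node when no tags are requested, otherwise keep matches only.
    selected = [n for n in nodes if wanted is None or n["tags"] & wanted]
    if not selected:
        # Same wording as the message in the traceback above.
        raise ValueError(
            "Pipeline contains no nodes after applying all provided filters"
        )
    return selected


# An empty project has no nodes, so the call fails without any filter given.
try:
    filter_nodes([])
except ValueError as err:
    print(f"ValueError: {err}")
```

This is why the message reads as misleading for a fresh project: it mentions filters, but the real cause is that no pipeline exists yet.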

davidmosca commented 5 months ago

I see. I don't have a pipeline, I was just following the installation instructions in sequence. The docs would benefit from clarifying that a pipeline is needed. Once I have built one, I will try again. In the meantime, you can close the ticket. Thanks.

astrojuanlu commented 5 months ago

Thanks for opening this issue @davidmosca and sorry you had a bumpy first experience.

I don't think this is only a docs issue: if there are no pipelines defined, kedro run should clearly say so:

> kedro run
No pipelines defined, use `kedro pipeline create` to create one
> echo $status
1
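
The behaviour proposed above could be sketched roughly as follows. This is only an illustration under stated assumptions: `check_pipelines_defined` is a hypothetical helper, and the registry is modelled as a plain dict mapping pipeline names to lists of node names rather than Kedro's real objects. The message and the exit status of 1 come from the proposal.

```python
import sys


def check_pipelines_defined(pipelines):
    """Return a shell-style exit status for a pipeline registry.

    `pipelines` is assumed to map pipeline names to lists of node names
    (a simplification for this sketch). Returns 0 if at least one
    pipeline has nodes, otherwise prints the proposed message and
    returns 1.
    """
    has_nodes = any(len(nodes) > 0 for nodes in pipelines.values())
    if not has_nodes:
        print(
            "No pipelines defined, use `kedro pipeline create` to create one",
            file=sys.stderr,
        )
        return 1
    return 0


# A freshly created project without an example: nothing to run.
status = check_pipelines_defined({"__default__": []})
print(status)
```

Running such a check before any filtering would let the command fail early with an actionable message instead of a stack trace.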

davidmosca commented 5 months ago

Then I don't know. I just followed the installation instructions sequentially with tools 1,2,4,5,7. kedro info returns the expected message.

astrojuanlu commented 5 months ago

If anything, there are 2 issues here:

https://github.com/kedro-org/kedro/blob/2b6f741d21a0c809dd803b0c85d9bd2cf4ce362a/docs/source/get_started/new_project.md?plain=1#L123-L133

noklam commented 5 months ago

You have selected no project tools

To skip the interactive flow you can run `kedro new` with
kedro new --name=<your-project-name> --tools=<your-project-tools> --example=<yes/no>

Consider adding some warning about an empty project.
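
One possible shape for such a warning, as a minimal sketch: `warn_if_empty_project` and its `example` parameter are hypothetical names chosen for this illustration, and the wording of the message is only a suggestion, not existing Kedro output.

```python
import warnings


def warn_if_empty_project(example):
    """Warn when a project is created without an example pipeline.

    `example` mirrors the yes/no choice of the --example option.
    Returns True if a warning was issued, False otherwise.
    """
    if not example:
        warnings.warn(
            "This project was created without an example pipeline, so "
            "`kedro run` will fail until you add a pipeline.",
            UserWarning,
            stacklevel=2,
        )
        return True
    return False


# Equivalent to choosing --example=n when creating the project.
warn_if_empty_project(example=False)
```

Warning at creation time complements a clearer `kedro run` error: the user learns about the empty project before hitting the failure.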

doshi-kevin commented 1 month ago

Hey, is this issue still open? I would like to work on it.

astrojuanlu commented 1 month ago

Hi @doshi-kevin, go ahead!

doshi-kevin commented 1 month ago

Hey @astrojuanlu, could you tell me which changes are yet to be applied? I see no change in the documentation, and I am getting a different error message when running kedro run.