kedro-org / kedro

Kedro is a toolbox for production-ready data science. It uses software engineering best practices to help you create data engineering and data science pipelines that are reproducible, maintainable, and modular.
https://kedro.org
Apache License 2.0

package Kedro project should return `session.run` #2681

Closed noklam closed 2 months ago

noklam commented 1 year ago

The problem with the main() method is that it currently returns an exit code, so downstream processes can't do anything with it.

To clarify this a bit: I think there are two separate problems.

  1. `main` should return the result of `session.run()`, so that downstream processing is possible.
  2. click calls `sys.exit()` by default, and because `main` wraps the CLI, this causes issues on Databricks or even in plain IPython - this link explains more.
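
To make the two problems concrete, here is a minimal standard-library sketch. It does not use Kedro or click: `fake_session_run`, `cli_style_main` and `returning_main` are hypothetical stand-ins, and the `sys.exit(0)` call only imitates what a click command does in its default standalone mode.

```python
import sys


def fake_session_run():
    """Hypothetical stand-in for `session.run()`: returns the pipeline outputs."""
    return {"example_output": 42}


def cli_style_main():
    """Imitates a CLI entry point that always exits, even on success."""
    fake_session_run()  # the result is discarded (problem 1)
    sys.exit(0)         # raises SystemExit in the calling process (problem 2)


def returning_main():
    """Variant that hands the run result back to the caller instead of exiting."""
    return fake_session_run()


# Problem 2: calling the CLI-style entry point from a notebook or another
# Python process raises SystemExit, which the caller has to catch.
try:
    cli_style_main()
    exited = False
except SystemExit as exc:
    exited = True
    exit_code = exc.code

# With a returning entry point, the caller can use the outputs directly.
outputs = returning_main()
```

In a real packaged project the exit comes from click rather than from an explicit `sys.exit`; click commands accept `standalone_mode=False` to propagate the return value instead of exiting, which is one possible direction rather than a decided fix.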

Originally posted by @noklam in https://github.com/kedro-org/kedro/issues/1423#issuecomment-1261071341

https://github.com/kedro-org/kedro-starters/blob/ef88b095119e050161dab5e593f184eaf11b1af0/pandas-iris/%7B%7B%20cookiecutter.repo_name%20%7D%7D/src/%7B%7B%20cookiecutter.python_package%20%7D%7D/__main__.py#L39-L43C10

def main(*args, **kwargs):
    package_name = Path(__file__).parent.name
    configure_project(package_name)
    run = _find_run_command(package_name)
    run(*args, **kwargs)

This currently doesn't return anything, so the caller never receives the output of `run`.

astrojuanlu commented 1 year ago

Related: https://github.com/kedro-org/kedro/issues/2682#issuecomment-1593075949

noklam commented 1 year ago

Prerequisite: #2682