kedro-org / kedro

Kedro is a toolbox for production-ready data science. It uses software engineering best practices to help you create data engineering and data science pipelines that are reproducible, maintainable, and modular.
https://kedro.org
Apache License 2.0

Add "add-ons" flow to `kedro new` CLI command #2850

Closed merelcht closed 1 year ago

merelcht commented 1 year ago

Description

Follow up on https://github.com/kedro-org/kedro/issues/2758

Context

Update the kedro new command to allow users to select "add-ons" from:

  1. testing
  2. linting
  3. logging
  4. data structure
  5. documentation

Possible Implementation

Use cookiecutter boolean variables to prompt users to select which add-ons they want.
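
As a rough illustration of this proposal, the sketch below builds the kind of `cookiecutter.json` content that would give one yes/no prompt per add-on. It assumes a cookiecutter version that supports boolean variables. The variable names (`add_testing`, `add_linting`, and so on) are hypothetical and not taken from the Kedro starter template.

```python
import json

# Hypothetical template variables: one boolean per add-on.
# With cookiecutter boolean variables, each of these would become
# a separate yes/no prompt during `kedro new`.
COOKIECUTTER_CONTEXT = {
    "project_name": "New Kedro Project",
    "add_testing": False,
    "add_linting": False,
    "add_logging": False,
    "add_data_structure": False,
    "add_documentation": False,
}


def render_cookiecutter_json(context: dict) -> str:
    """Serialise the context the way it would appear in cookiecutter.json."""
    return json.dumps(context, indent=2)


if __name__ == "__main__":
    print(render_cookiecutter_json(COOKIECUTTER_CONTEXT))
```

Note that this layout implies five separate prompts, one for each add-on.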

Possible Alternatives

N/A

amandakys commented 1 year ago

The flow for the new kedro new would look something like this:

(my-virtual-environment) ➜  kedro new 

Project Add-Ons 
================
Here you can select which add-ons you'd like to include. 
Don't worry if you change your mind, you can always add/remove these later.
To read more about these add-ons and what they do visit: kedro.org/{insert-documentation}

Add-Ons 
1) Linting: Provides a basic linting setup with Flake8, Black and isort
2) Testing: Provides a basic testing setup with pytest
3) Logging: Provides more logging options (environment specific)
4) Documentation: Provides a basic documentation setup with Sphinx
5) Data Structure: Provides a directory structure for storing data

Which add-ons would you like to include in your project? [1-4/all/1,3]: 

Project Name
============
Please enter a human readable name for your new project.
Spaces, hyphens, and underscores are allowed.

 [New Kedro Project]: My ML pipeline 
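
The add-ons prompt above accepts a range, "all", or a comma-separated list. Below is a minimal sketch of how such an answer could be parsed. The short add-on names, the function name `parse_add_ons`, and the choice to treat an empty answer or "none" as "no add-ons" are all assumptions for illustration, not the behaviour Kedro implements.

```python
# Hypothetical short names, numbered in the same order as the prompt above.
ADD_ONS = {
    1: "lint",
    2: "test",
    3: "log",
    4: "docs",
    5: "data",
}


def parse_add_ons(answer: str) -> list[str]:
    """Turn an answer such as "1-4", "all" or "1,3" into add-on names."""
    text = answer.strip().lower()
    # Assumption: pressing enter or typing "none" selects no add-ons.
    if text in ("", "none"):
        return []
    if text == "all":
        return [ADD_ONS[number] for number in sorted(ADD_ONS)]

    selected = set()
    for part in text.split(","):
        part = part.strip()
        if "-" in part:
            # A range such as "1-4" is inclusive on both ends.
            start, end = part.split("-", 1)
            numbers = range(int(start), int(end) + 1)
        else:
            numbers = [int(part)]
        for number in numbers:
            if number not in ADD_ONS:
                raise ValueError(f"'{number}' is not a valid add-on number")
            selected.add(number)
    return [ADD_ONS[number] for number in sorted(selected)]
```

For example, `parse_add_ons("1,3")` gives `["lint", "log"]`, and anything that is not a known number raises `ValueError`, so the prompt can be repeated.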

Adding the corresponding CLI commands and flags is outside the scope of this ticket. This work has been written up in other tickets:

#2865: kedro new --name=<project-name>

#2873: kedro new --addons

merelcht commented 1 year ago

Questions raised in grooming:

  1. When does the default kick in for the add-ons? Is there any scenario in which the user doesn't go through the prompts, e.g. a config file? -> This is for when the user runs kedro new --add-ons
  2. How does a user indicate they don't want any add-ons? Do they just press enter, or do they specify "none"?

deepyaman commented 1 year ago

kedro new --project-name=<project-name>

Nit: Prefer kedro new --name=<project-name>, since it's shorter and similar to other commands that come to mind (e.g. conda create --name).

deepyaman commented 1 year ago

On the discussion during grooming, IMO having the plugins called “docs”, “lint”, “test”, etc. sounded fine, since you already have a definition in-line.

SajidAlamQB commented 1 year ago

I've looked into boolean_variables on cookiecutter and I'm not so sure they are the best approach here. Our options are multi-choice. If we opt for boolean_variables, each add-on becomes a separate prompt with a yes/no decision. I feel this would be a detriment to the user experience, as it would take users longer to answer a series of yes/no questions than to make a single multi-choice selection.

The drawback of going with a single multi-choice option is that the implementation in post_gen_project.py will be a bit more complex.
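
To give a feel for the extra logic this would put in the hook, here is a minimal sketch of removing files for add-ons that were not selected. The mapping `ADD_ON_PATHS`, the file and directory names in it, and the function `remove_unselected` are hypothetical and not taken from the actual Kedro template. In a real cookiecutter post-generation hook, the project root would be the current working directory.

```python
import shutil
from pathlib import Path

# Hypothetical mapping from add-on name to the paths it contributes
# to the generated project.
ADD_ON_PATHS = {
    "lint": [".flake8"],
    "test": ["tests"],
    "docs": ["docs"],
    "data": ["data"],
}


def remove_unselected(project_root: Path, selected: set[str]) -> list[str]:
    """Delete paths belonging to add-ons the user did not select.

    Returns the relative paths that were actually removed.
    """
    removed = []
    for add_on, paths in ADD_ON_PATHS.items():
        if add_on in selected:
            continue
        for relative in paths:
            target = project_root / relative
            if target.is_dir():
                shutil.rmtree(target)
            elif target.exists():
                target.unlink()
            else:
                # Nothing to clean up for this path.
                continue
            removed.append(relative)
    return removed
```

Inside a hook this could be called as `remove_unselected(Path.cwd(), selected)`, where `selected` comes from the single multi-choice answer.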

merelcht commented 1 year ago

That makes a lot of sense @SajidAlamQB, I think it's fine if post_gen_project.py is more complex. I completely agree with you that having a yes/no decision for each prompt is not a good user experience. The post gen hook isn't user facing, so it's a much better place to deal with complex logic.

AhdraMeraliQB commented 1 year ago

Closed by #2987