Kedro is a toolbox for production-ready data science. It uses software engineering best practices to help you create data engineering and data science pipelines that are reproducible, maintainable, and modular.
There is a discrepancy in how the project tools selection is specified through the CLI (shorthand names) versus through the interactive flow (a numbered selection). When we suggest using the CLI flag --tools after the interactive flow, the hint does not acknowledge this discrepancy, which could cause confusion.
The hint currently reads:
To skip the interactive flow you can run kedro new with
kedro new --name=<your-project-name> --tools=<your-project-tools> --example=<yes/no>
Instead, the --tools=<your-project-tools> placeholder should show the format to use, perhaps something like --tools=<lint/test/log/data/docs/pyspark/viz>. CC @iamelijahko @stephkaiser for opinions.
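As a rough illustration of the proposal, the hint could be assembled from the list of short names so that the placeholder always spells out the accepted values. This is a minimal sketch, not Kedro's actual code: the function name build_skip_hint and the constant TOOL_SHORTNAMES are hypothetical, and the short names are copied from the suggestion above rather than read from Kedro.

```python
# Short names as proposed in this issue; hypothetical constant, not imported from Kedro.
TOOL_SHORTNAMES = ["lint", "test", "log", "data", "docs", "pyspark", "viz"]


def build_skip_hint(shortnames=TOOL_SHORTNAMES):
    """Return the hint text with the accepted --tools values spelled out."""
    tools_placeholder = "/".join(shortnames)
    return (
        "To skip the interactive flow you can run kedro new with\n"
        "kedro new --name=<your-project-name> "
        f"--tools=<{tools_placeholder}> --example=<yes/no>"
    )


if __name__ == "__main__":
    print(build_skip_hint())
```

Building the placeholder from one list would keep the hint in step with the available tools if the list ever changes, although whether that is worth the indirection is a design question for this task.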
Project Tools
=============
These optional tools can help you apply software engineering best practices.
To skip this step in future use --tools
To find out more: https://docs.kedro.org/en/stable/starters/new_project_tools.html
Tools
1) Lint: Basic linting with Black and Ruff
2) Test: Basic testing with pytest
3) Log: Additional, environment-specific logging options
4) Docs: A Sphinx documentation setup
5) Data Folder: A folder structure for data management
6) PySpark: Configuration for working with PySpark
7) Kedro-Viz: Kedro's native visualisation tool
Which tools would you like to include in your project? [1-7/1,3/all/none]:
[none]:
When you run kedro new, you see the prompt above, where tools must be chosen by number. But if you run kedro new --tools=..., the tools must be chosen by their short names.
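To make the mismatch concrete, the sketch below translates a numbered selection from the interactive prompt into the equivalent --tools value. It is an illustration only: the helper selection_to_tools_flag and the NUMBER_TO_SHORTNAME table are hypothetical, the numbering follows the prompt quoted above, and the short names (including "data" for Data Folder and "viz" for Kedro-Viz) are assumed from the list proposed in this issue. Passing "all" and "none" through unchanged is likewise an assumption about what the flag accepts.

```python
# Numbering follows the interactive prompt; short names are assumed from this issue.
NUMBER_TO_SHORTNAME = {
    1: "lint",
    2: "test",
    3: "log",
    4: "docs",
    5: "data",
    6: "pyspark",
    7: "viz",
}


def selection_to_tools_flag(selection: str) -> str:
    """Convert a prompt answer such as '1,3' or '1-3' into a --tools=... string."""
    selection = selection.strip().lower()
    if selection in ("all", "none"):
        # Assumed to be accepted verbatim by the flag.
        return f"--tools={selection}"

    chosen = set()
    for part in selection.split(","):
        part = part.strip()
        if "-" in part:
            # A range like '2-4' covers both endpoints.
            start, end = (int(p) for p in part.split("-"))
            chosen.update(range(start, end + 1))
        else:
            chosen.add(int(part))

    unknown = chosen - set(NUMBER_TO_SHORTNAME)
    if unknown:
        raise ValueError(f"Unknown tool numbers: {sorted(unknown)}")

    names = [NUMBER_TO_SHORTNAME[n] for n in sorted(chosen)]
    return "--tools=" + ",".join(names)


if __name__ == "__main__":
    print(selection_to_tools_flag("1,3"))  # --tools=lint,log
```

A user who answered 1,3 in the prompt would therefore need to type lint,log on the command line, which is exactly the translation the current hint leaves unexplained.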
[!NOTE]
This will be a combined design & engineering task
This page in the docs will also need updating: https://docs.kedro.org/en/latest/get_started/new_project.html#project-tools