kedro-org / kedro

Kedro is a toolbox for production-ready data science. It uses software engineering best practices to help you create data engineering and data science pipelines that are reproducible, maintainable, and modular.
https://kedro.org
Apache License 2.0
9.97k stars 904 forks source link

Kedro new starter CLI : user_input.lower() #3783

Open Ipsedo opened 7 months ago

Ipsedo commented 7 months ago

Description

I use kedro starter CLI configured by prompt.yml with regex check and I think I found one issue : the user inputs are systematically transformed to lowercase : https://github.com/kedro-org/kedro/blob/main/kedro/framework/cli/starters.py#L957 The result is that I can't restrict user input to uppercase or lowercase

Context

I want to restrict user input to uppercase

Steps to Reproduce

  1. create a cookiecutter template for kedro (regarding your need)
  2. create your cookiecutter.json with project_name as unique prompt
  3. create your prompt.yml with an entry for project_name
  4. set the regex_validator field to "[A-Z_]+"
  5. start kedro new --starter=./my_starter and answer MY_PROJECT for project_name prompt
  6. It will fail due to the user_input.lower() in kedro/framework/cli/starters.py line 957

Expected Result

The regex and the input must match

Actual Result

The user input is refused

Your Environment

Python : 3.9.13 Kedro : 0.19.3 OS : Linux (distribution and kernel version confidential)

noklam commented 7 months ago

I think that is by design that we don't allow upper case. As I understand this is the convention for Python community.

https://gist.github.com/etigui/7600441926e73c3385057718c2fdef8e

However the error may not be clear enough, curious if you are creating your own starters?

Ipsedo commented 7 months ago

I know about python naming convention, the point is not here : the regex of prompt for project name in uppercase was just for the example. In reality I use the regex for a variable value in the project which needs to be in uppercase, your code is clear : you apply lower() method on the user input which is not correct.

noklam commented 7 months ago

@lpsedo got it, would you like to send a PR to fix this? It should be a relative simple one happy to merge this.

Ipsedo commented 7 months ago

Sure, I will potentially make the PR this weekend

merelcht commented 5 months ago

Hi @Ipsedo are you still interested in creating a PR for this?

parthmshah1302 commented 4 months ago

@merelcht I would like to work on this - do I need to be assigned first or should I submit a PR directly?

merelcht commented 4 months ago

Hi @parthmshah1302, thanks for helping out! You can just go ahead and create a PR when you're ready šŸ™‚

AdityaS8804 commented 1 month ago

Hi, could I please get a status update on this? Iā€™d like to work on it if possible.

merelcht commented 1 month ago

Hi @AdityaS8804, the ticket is still open, so you can just go ahead and create a PR when you're ready šŸ™‚

astrojuanlu commented 4 days ago

Summary:

The docs say that starter creators can add their own variables with custom regex_validator:

https://docs.kedro.org/en/stable/starters/create_a_starter.html#custom-project-creation-variables

but Kedro is not allowing uppercase letters even in variables:

https://github.com/kedro-org/kedro/blob/87773e85a4ac9e06e3c6b83461a6ffe9510cf7d3/kedro/framework/cli/starters.py#L957

Future takers of this issue: don't ask for permission, just go ahead and open a PR. If you need clarification about the problem, don't hesitate to ask.