kedro-org / kedro

Kedro is a toolbox for production-ready data science. It uses software engineering best practices to help you create data engineering and data science pipelines that are reproducible, maintainable, and modular.
https://kedro.org
Apache License 2.0

Move PySpark to end of Tools prompt #3903

Closed bpmeek closed 2 weeks ago

bpmeek commented 4 months ago

Description

I'm always frustrated when creating a new Kedro project because I generally want all of the features except PySpark.

Context

Most of the available new-project tools are carry-overs from the previous standard starters; PySpark is the exception. I think it makes sense to swap the PySpark and Kedro-Viz options so that users can enter 1-6 to get every tool other than PySpark, which I imagine is used less often than Kedro-Viz.

Possible Alternatives

You could add an "except" flag, for example "all except 6", or something along those lines.

DimedS commented 3 months ago

Thanks, @bpmeek. That makes sense to me as well. I will add it to the sprint planning.

merelcht commented 2 months ago

Hi @bpmeek, do you know you can also do 1-5,7 as a selection option when going through the kedro new flow?
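
For readers unfamiliar with this kind of selection string, here is a minimal Python sketch of how an input such as 1-5,7 could be expanded into individual tool numbers. It is an illustration only, not Kedro's actual parsing code: the function name `parse_tool_selection` is hypothetical, and the default of 7 tools is an assumption inferred from this thread (PySpark as option 6, Kedro-Viz as option 7).

```python
def parse_tool_selection(selection: str, num_tools: int = 7) -> list[int]:
    """Expand a selection such as "1-5,7" into a sorted list of tool numbers.

    Hypothetical helper for illustration; num_tools=7 is an assumption.
    """
    chosen: set[int] = set()
    for part in selection.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas such as "1,,3"
        if "-" in part:
            # A range like "1-5" covers both endpoints.
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
        else:
            # A single number like "7" is a range of length one.
            start = end = int(part)
        if start > end or start < 1 or end > num_tools:
            raise ValueError(f"invalid selection: {part!r}")
        chosen.update(range(start, end + 1))
    return sorted(chosen)


print(parse_tool_selection("1-5,7"))  # [1, 2, 3, 4, 5, 7]
```

Under these assumptions, 1-5,7 selects every tool except number 6, which is the outcome the original request was after.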

bpmeek commented 2 months ago

@merelcht I was not aware of that. That certainly helps!

merelcht commented 2 months ago

Great! Moving PySpark to the end of the tools prompt is a breaking change, so we'd rather not do it unless a large part of our user base wants it. We can leave the issue open to gather opinions, or if you're happy with the selection syntax above, we'll close it. @bpmeek

merelcht commented 2 weeks ago

Closing this for now, since no more comments have come in.