What do our users use parameters for?

yetudada commented 1 year ago

Description

Parameters have been a long-standing feature of Kedro projects. Users can interact with parameters for data engineering and data science pipelines by using:

- the parameters.yml file located in conf/base
- a parameters folder, with parameter files either added according to their own naming scheme or created by Kedro when a new pipeline is created
- the CLI, passing parameters that override those specified in the YAML files

We have yet to learn what our users have been using parameters for, and it's time to understand this so that we can better support their use cases.

The scope of this ticket includes:

- A research exercise in unpacking the different use cases for parameters
  - Additionally, we may need to define the different types of parameters that our users interact with - this language is vague
- A design exercise to recommend ways that we can address these use cases

Data sources

There are a few data sources that can be leveraged to support this project:

Qualitative:
- Interviews with users; we will need to compile the list
- Observation studies; combing through existing Kedro projects to see how parameters were used, or asking our users to show us what they've done
Quantitative:
- Have a look at telemetry data from kedro-telemetry to see the use case for kedro run --params param_key1:value1,param_key2:2.0 in action

Random thoughts

I have been curious about supporting the reporting use case, e.g. it appears that some teams will create Jupyter notebooks and include parameters that can be changed in the notebook. The idea is that an end user can change those parameters, run the pipeline from there, and then see the charts and generated tables. If this use case is prevalent (I have no data to say that it is), could this be supported on Kedro-Viz - users change the parameters, run the pipelines and see dashboards?

datajoely commented 1 year ago

I think telemetry will be masked at source, so we won't be able to check the contents of the parameterisation. It will just look like kedro run --params *****, irrespective of whether there are comma-delimited values.
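
As a rough illustration of why masked telemetry cannot answer this question, the sketch below replaces whatever follows `--params` with a fixed mask. This is not kedro-telemetry's actual code; `mask_params` is a hypothetical helper written only to show the effect described above.

```python
def mask_params(argv: list[str]) -> list[str]:
    """Replace the value that follows --params with a fixed mask."""
    masked = []
    hide_next = False
    for arg in argv:
        if hide_next:
            # The parameter string is dropped entirely, whatever it contained.
            masked.append("*****")
            hide_next = False
        else:
            masked.append(arg)
            hide_next = arg == "--params"
    return masked


command = ["kedro", "run", "--params", "param_key1:value1,param_key2:2.0"]
print(" ".join(mask_params(command)))
```

One key or ten comma-delimited keys produce the same masked command, so neither the number nor the names of the overridden parameters survive.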

datajoely commented 3 months ago

We have a lot of internal validation that passing the global parameters mega block should be discouraged, and that we should encourage users to provide keys that are as specific as possible. It makes typing much, much easier and the code more maintainable.

I would love to see the parameters mega block be dropped in 1.0.0
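
A small sketch of the typing difference, assuming a hypothetical `model_options.test_size` parameter. The two functions are plain Python so the example runs without Kedro; the comments note how each would typically be wired as node inputs.

```python
from typing import Any


def split_with_mega_block(
    rows: list[int], parameters: dict[str, Any]
) -> tuple[list[int], list[int]]:
    # Wired with inputs such as ["rows", "parameters"]: the whole dictionary
    # arrives, so the signature cannot say which keys are needed or their types.
    test_size = parameters["model_options"]["test_size"]
    cut = int(len(rows) * (1 - test_size))
    return rows[:cut], rows[cut:]


def split_with_specific_key(
    rows: list[int], test_size: float
) -> tuple[list[int], list[int]]:
    # Wired with inputs such as ["rows", "params:model_options.test_size"]:
    # only the needed value arrives, so it can be typed precisely.
    cut = int(len(rows) * (1 - test_size))
    return rows[:cut], rows[cut:]


rows = list(range(8))
all_params = {"model_options": {"test_size": 0.25}, "unrelated": {"x": 1}}

train_a, test_a = split_with_mega_block(rows, all_params)
train_b, test_b = split_with_specific_key(rows, 0.25)
print(train_b, test_b)
```

Both functions return the same split, but only the second one documents its dependency in the signature and can be tested without building a nested dictionary.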
