kedro-org / kedro-starters

Templates for your Kedro projects.
Apache License 2.0
63 stars 57 forks source link

fix: Revise cookiecutter json in starters #205

Closed DimedS closed 5 months ago

DimedS commented 6 months ago

Motivation and Context

Removing the default values of the tools and example_pipeline variables from cookiecutter.json in the four starters is currently not feasible. Variables utilised in Cookiecutter must be declared in cookiecutter.json, even if they are consistently set during the cookiecutter() function call (which overrides values from the file). Omitting these declarations results in Cookiecutter raising an UndefinedVariableInTemplate error.

However, for clarity, it might be beneficial to leave the values blank or indicate their status with a note, such as:

tools: "protected internal variable - do not modify"
example_pipeline: "protected internal variable - do not modify"

This approach informs users of the special nature of these variables, highlighting that they are internally managed and should not be altered.

Checklist

AhdraMeraliQB commented 5 months ago

Summarising our discussion offline for posterity:

In Kedro's implementation, cookiecutter (and hence pyproject.toml) will always receive values that overwrite the defaults specified in cookiecutter.json. However, this knowledge is not available to cookiecutter, and so not supplying a default value results in an error.

I agree that indicating this is important for the understandability of the code, though I don't think empty strings are informative but perhaps something like:

tools: "default value required by cookiecutter - overwritten by Kedro implementation"
example_pipeline: "default value required by cookiecutter - overwritten by Kedro implementation"

Which informs why this default is there and it's lack of effect on the rest of the project generation process.

CC: @merelcht

DimedS commented 5 months ago

It appears that in previous instances, a recursive link was utilized to indicate that the value would be derived from the codebase when executing the cookiecutter() function, like: "kedro_version": "{{ cookiecutter.kedro_version }}", so we can keep the same or discuss any more straight meaning text

SajidAlamQB commented 5 months ago

Just so I understand are you saying that even if we are getting "kedro_version": "{{ cookiecutter.kedro_version }}" from the codebase it doesn't matter as we just replace it when we come to create the project?

If thats the case I have no issue replacing it with some descriptive placeholder for future contributors.

DimedS commented 5 months ago

Thank you, @AhdraMeraliQB, for the comprehensive description of the issue and the potential solutions. I agree with both @AhdraMeraliQB and @SajidAlamQB and have modified the variable names in accordance with @AhdraMeraliQB 's proposal. Let's wait until next week; we might receive additional opinions by then.

merelcht commented 5 months ago

I personally find the comment a bit messy and would prefer to just go for the "kedro_version": "{{ cookiecutter.kedro_version }}" style for all these fields. IMO we should document the behaviour for ourselves rather than put explanations in user facing files.

It would be good to hear what the rest of the team thinks @ankatiyar @noklam @lrcouto

In any case, whatever decision is made this should be applied across all starters for consistency.

noklam commented 5 months ago

I tend to agree @merelcht on this one. It's not something that user can change. It is still a good note that we should keep, maybe somewhere in the codebase or the develop docs if applicable.