Open noklam opened 2 years ago
Looking back through the discussion in https://github.com/kedro-org/kedro/pull/1265, I can confirm that @Galileo-Galilei independently had the same feeling as me on this:
It makes sense for kedro itself to have a default to the starter matching the kedro version, but I don't think it really does for user defined starters.
He also thought `checkout` should not become part of `KedroStarterSpec`, which is fine by me. So the behaviour would be exactly as you describe above 👍
Default checkout version should be `None`, cookiecutters should take care of pulling the latest version. (Do check if this is the case)
I can confirm from looking through cookiecutter that this is correct. If `checkout` is `None` then it clones the repo and doesn't checkout any particular branch, which means it will automatically be on `HEAD`. So we don't need to do anything special here - it will work by itself.
The only catch with changing this is that it's a breaking change, so we will need to put in some warning that the behaviour will change when the user calls a starter that's not a built-in one.
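A minimal sketch of the clone behaviour described above, assuming a clone step followed by an optional checkout step. This is a simplified model for illustration, not cookiecutter's actual code; the function name `git_commands` and the example URL are hypothetical.

```python
from typing import List, Optional


def git_commands(repo_url: str, checkout: Optional[str] = None) -> List[List[str]]:
    """Return the git commands a clone-then-checkout step would run.

    Simplified model for illustration only (hypothetical helper).
    """
    commands = [["git", "clone", repo_url]]
    # Only an explicit checkout value adds a second step. With None, the
    # working tree stays on HEAD of the branch that was cloned.
    if checkout is not None:
        commands.append(["git", "checkout", checkout])
    return commands


# No checkout given: clone only, so the starter is used at HEAD.
print(git_commands("https://example.com/org/starter.git"))
# Explicit tag: this step fails if the starter repo has no such tag or branch.
print(git_commands("https://example.com/org/starter.git", checkout="0.18.2"))
```

With `checkout=None` there is no second command, which is why no special handling would be needed for community starters.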
It's unclear to me what problem this solves; would you folks please clarify? Also, wondering if any of the new project tools affect this.
From memory, the main problem goes like this:
- User does `pip install kedro==0.18.2` and does `kedro new -s ...` ➡️ error message saying git can't checkout the starter with tag/branch 0.18.2
- Starter maintainer adds tag `0.18.2` to their git repo and all works ok
- User does `pip install kedro==0.18.3` and `kedro new -s ...` ➡️ error message saying git can't checkout the starter with tag/branch 0.18.3
- So kedro releases a new version `0.18.x`, which is specifically designed to be backwards compatible with any previous `0.18.x` project template and yet necessitates manual intervention from starter maintainer in order to have existing workflows not break
The current solution (other than starter maintainer making a new tag every single kedro release) would be for the user to explicitly specify `--checkout main` or similar. The problem is that this should really be the default behaviour rather than needing to be explicitly specified.
The only possible problem that changing this causes is that it would mean the default `--checkout` value would be different in official kedro starters (where it would use kedro version) compared to all others (where it would take `None` and so default to e.g. `main` branch as above). This is ok by me, although maybe it's worth considering if the same change should actually be made for official kedro starters also.
Background
Currently `kedro new` assumes the starter uses the same version as `kedro`, i.e. with kedro==0.18.0 it will pull the `0.18.0` `kedro-starters` template. This made sense when we only had `kedro-starters`. We introduced the starters template plugin in `0.18.2` and this assumption no longer holds.
Description
- Default checkout version should be `None`, cookiecutters should take care of pulling the latest version. (Do check if this is the case)
- `kedro` official starters should be a special case in this, so we need to keep backward compatibility: `kedro new --starter=pandas-iris` should still pull the `starter` version that matches the `kedro` version
Example: Kedro version = 0.18.2
Official starter:
`kedro new --starter=pandas-iris`
- It should pull starters with the `0.18.2` tag
Community starter:
`kedro new --starter=plugin_starter`
- It should pull the `main` branch when the `checkout` argument isn't provided
`kedro new --starter=xxxxx --checkout branch_name`
- It should still respect the `checkout` argument and the behavior should not be changed.
[ ] Reach out to @Galileo-Galilei and @WaylonWalker to check if the above proposal would be a breaking change for them. If not, we can introduce this change as non-breaking.
[ ] Do a poll to check with the wider community
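The three cases in the example above could be sketched roughly as follows. This is only an illustration of the proposed rule, not kedro's implementation: the function `resolve_checkout` and the `OFFICIAL_STARTERS` set are hypothetical names, and the set lists only the one official starter mentioned in this issue.

```python
from typing import Optional

# Hypothetical registry of built-in starter aliases (assumption for this
# sketch; only the alias named in this issue is listed).
OFFICIAL_STARTERS = frozenset({"pandas-iris"})


def resolve_checkout(
    starter: str, checkout: Optional[str], kedro_version: str
) -> Optional[str]:
    """Pick the value to pass on as the checkout argument (proposed rule)."""
    # 1. An explicit --checkout always wins; this behaviour is unchanged.
    if checkout is not None:
        return checkout
    # 2. Official starters keep pulling the tag matching the kedro version.
    if starter in OFFICIAL_STARTERS:
        return kedro_version
    # 3. Community starters get None, so the clone stays on the repo's HEAD.
    return None


print(resolve_checkout("pandas-iris", None, "0.18.2"))
print(resolve_checkout("plugin_starter", None, "0.18.2"))
print(resolve_checkout("xxxxx", "branch_name", "0.18.2"))
```

Keeping the explicit argument as the first check means the rule only changes what happens when the user omits `--checkout` for a non-official starter.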