Open jmholzer opened 1 year ago
This feature makes it possible to run Kedro on, e.g., Databricks jobs. I get the point that "repo-name" isn't super intuitive, but that's only visible in the "raw" framework and starters template anyway; once a user has created a project, it has already been translated into the actual name of the project. We've also never had users flag this, so I think we can just close this issue. @astrojuanlu ?
If I understand, this is actually not the entrypoint we use for Databricks. This will work as long as the `__main__.py` exists, as this is how Python executes a module, i.e. `python -m <package>`.
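To make the `python -m <package>` point concrete, here is a minimal sketch using only the standard library. It builds a throwaway package with a `__main__.py` and executes it the same way `python -m` does, via `runpy.run_module`. The helper name `run_package_as_module` and the package name `demo_pkg` are made up for illustration and are not part of Kedro.

```python
import runpy
import sys
import tempfile
from pathlib import Path


def run_package_as_module(package_name: str, main_source: str) -> dict:
    """Create a throwaway package and execute it like `python -m <package>`."""
    with tempfile.TemporaryDirectory() as tmp:
        pkg_dir = Path(tmp) / package_name
        pkg_dir.mkdir()
        (pkg_dir / "__init__.py").write_text("")
        (pkg_dir / "__main__.py").write_text(main_source)
        sys.path.insert(0, tmp)
        try:
            # For a package name, runpy executes <package>/__main__.py
            # and returns the resulting module globals.
            return runpy.run_module(package_name, run_name="__main__")
        finally:
            sys.path.remove(tmp)
            sys.modules.pop(package_name, None)


# Hypothetical __main__.py content: record the name it was executed under.
module_globals = run_package_as_module("demo_pkg", "RESULT = __name__\n")
print(module_globals["RESULT"])
```

The only requirement is that `__main__.py` exists inside the package; no console-script entry point is involved.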
The extra entrypoint is a weird one that we never documented. For example, when you create a spaceflights project called `my_project`, it creates a CLI so that you can run `my_project` in the terminal, as if you were running `kedro run`. IMO we should remove it, since no one is using it, and adding an extra way to run a project is confusing and doesn't add much benefit.
So if I understand correctly, `python -m <package>` will always work, and this issue is about removing the `package_name` CLI, right?
I see the `package_name` CLI is not mentioned in our tutorial https://docs.kedro.org/en/stable/tutorial/package_a_project.html#run-a-packaged-project nor in our single-machine deployment page https://docs.kedro.org/en/stable/deployment/single_machine.html#package-based
My guess is that having this extra way of running the project adds very little value, given that users can already:
- run `python -m package`
- call `from <package_name>.__main__ import main; main()` in Python
- create a `KedroSession` manually https://docs.kedro.org/en/stable/kedro_project_setup/session.html

And also, this is just about the defaults in the template, right? Users can still define their own entry point.
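As a rough illustration of what a console-script entry point does, the sketch below resolves a spec of the form `<package_name>.__main__:main` to a callable and calls it, which is the same thing as the `from <package_name>.__main__ import main; main()` route. It assumes the template's entry point targets `main` in `__main__.py`; the stand-in `main()` body, the helper `call_via_entry_point`, and the names `my_project` / `my-project` are hypothetical.

```python
import sys
import tempfile
from importlib.metadata import EntryPoint
from pathlib import Path

# Stand-in for a packaged project's __main__.py (assumption: the real one
# defines a main() function that starts the run).
MAIN_SOURCE = '''
def main():
    return "pipeline run requested"
'''


def call_via_entry_point(package_name: str, command: str) -> str:
    """Resolve '<package>.__main__:main' like a console script would, then call it."""
    with tempfile.TemporaryDirectory() as tmp:
        pkg_dir = Path(tmp) / package_name
        pkg_dir.mkdir()
        (pkg_dir / "__init__.py").write_text("")
        (pkg_dir / "__main__.py").write_text(MAIN_SOURCE)
        sys.path.insert(0, tmp)
        try:
            entry_point = EntryPoint(
                name=command,
                value=f"{package_name}.__main__:main",
                group="console_scripts",
            )
            main = entry_point.load()  # imports the module, returns main
            return main()
        finally:
            sys.path.remove(tmp)
            for name in (package_name, f"{package_name}.__main__"):
                sys.modules.pop(name, None)


print(call_via_entry_point("my_project", "my-project"))
```

The command name on the left of the entry point is arbitrary, which is why users can still define their own even if the template default goes away.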
Ah sorry, I was confused! It does look like we can just remove that entrypoint. Removing `repo_name` from the cookiecutter settings might not be entirely straightforward though. I ran into some issues testing it just now, so we'll have to make sure removing it doesn't cause issues with older versions.
Description
Currently, we expose an entry point to packaged projects that corresponds to `cookiecutter.repo_name`. This is done by `src/setup.py` in our project template and starters:

Update 22/10/2024: This is now in `pyproject.toml`:

This allows a user to run their installed, packaged projects from the command line by using the 'repo name' of their project, which is defined in `cookiecutter.json` as follows:

This has been a part of the code base since Kedro 0.14.0, though it is not documented anywhere. I do not think we should include undocumented features in Kedro, so we have two options:

1. Document the entry point.
2. Remove the entry point.
I do not prefer option 1, and neither does @noklam. This is because the concept of a 'repo name' is not documented anywhere, and adding it would confuse our users, since its meaning is not intuitive. In addition, there is already a more intuitive way of running a packaged project, using `python -m <package_name>`. For these reasons, @noklam and I prefer option 2.

Possible alternatives
We could discuss modifying the entry point, assigning a different command to it that is more intuitive. However, `package_name` contains underscores, which is inconsistent with the CLI exposed by `kedro run`, and `project_name` is also unsuitable because it can contain spaces.
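For discussion purposes, here is a small sketch of the naming problem. The helper `to_command_name` is hypothetical and not something Kedro provides; it only shows what normalising either `package_name` or `project_name` into a hyphenated, space-free command could look like if we went down this route.

```python
import re


def to_command_name(name: str) -> str:
    """Lowercase a name and replace runs of whitespace or underscores with a hyphen."""
    return re.sub(r"[\s_]+", "-", name.strip()).lower()


# Both inputs are illustrative values, not taken from a real project.
print(to_command_name("my_project"))
print(to_command_name("My Project"))
```

Both inputs collapse to the same command, so a derived name would effectively be a third identifier for users to learn alongside `package_name` and `project_name`.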