Closed ankatiyar closed 3 months ago
As @ankatiyar noted internally, "bad" lookups will take a lot of time: for example, `kedro infoo` (notice the typo) took 17 seconds on my computer and tried to import kedro-viz, kedro-mlflow, and more.
Going forward, is there a smarter way we can declare new CLI commands, so that we can look them up in entry points without actually importing anything? (Feel free to open a new issue about this)
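One possible direction, as a rough sketch rather than a proposal for the final design: `importlib.metadata.entry_points()` reads the entry point metadata of installed distributions without importing the modules they point to; the import only happens when `EntryPoint.load()` is called. The helper names `entry_point_names` and `find_command` are hypothetical, and the group name `kedro.project_commands` is used here as an assumption about where plugins register their commands.

```python
from importlib.metadata import entry_points


def entry_point_names(group: str) -> dict:
    """Map entry point name -> EntryPoint for a group, without importing plugin code."""
    eps = entry_points()
    if hasattr(eps, "select"):
        # Python 3.10+ exposes a selectable interface.
        selected = eps.select(group=group)
    else:
        # Older versions return a plain dict keyed by group name.
        selected = eps.get(group, [])
    return {ep.name: ep for ep in selected}


def find_command(name: str, group: str = "kedro.project_commands"):
    """Import a plugin only if an entry point with exactly this name is declared."""
    ep = entry_point_names(group).get(name)
    # ep.load() is the only step that actually imports the plugin module.
    return ep.load() if ep is not None else None
```

With this shape, a typo such as `infoo` would be rejected after a metadata lookup alone. It only works if the entry point name equals the command name, which is not the case for every plugin today.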
The code coverage is not complete but opening this up for review anyway to gather feedback on the implementation.
Fair point @merelcht! 🤔
I was also trying to implement a way where `--help` does trigger the loading of the plugins.
About overwriting of core Kedro commands, e.g. `micropkg`, that might also be possible with lazy loading. Let me explore these functionalities and update the PR (if it works!)!
Unfortunately, I don't think the overriding behaviour of the CLI is possible with lazy loading of plugins, i.e. if a plugin has overriding CLI commands for Kedro core commands such as `kedro catalog list` etc., since the premise of the proposal is to only load plugins if a command does not exist in `KedroCLI`.
I think it's possible to trigger the loading of plugins and display plugin commands with `kedro --help`, but I would like to get the team's opinions on whether we should go ahead with this at all.
cc @astrojuanlu @noklam @merelcht
The `python` CLI has `--help` and `--help-all`. Maybe for the next breaking release we could make `--help` display only the "core" commands, and `--help-all` display everything.
For retaining the current behavior while attaining much faster load times for all the rest of commands, I'd be 👍🏼 on force-triggering loading of all plugins in kedro --help
Moving this back to the drafts, will close it if no other comments come in. I'll focus on getting https://github.com/kedro-org/kedro-viz/pull/1920 and https://github.com/kedro-org/kedro/pull/3883 ready for review first!
Closing this in favour of the other solution!
Description
Partly resolve #1476
As discussed in #1476, the initialisation of `KedroCLI` takes up a significant chunk of time. This is especially evident if you have a lot of Kedro plugins installed in the environment, as `KedroCLI`, which is a `CommandCollection`, loads all the commands from the plugins before the command is processed. This PR adds a lazy way to load the commands from the plugins.
Development notes
- `plugin_groups` which returns a `dict` of `entry_point_name` to the `EntryPoint` object.
- `CommandCollection` which is a custom version of `click.CommandCollection` used by Kedro and defined in `kedro/framework/cli/utils.py`
Update to custom `CommandCollection`:
- `main` function which receives the command to be run as `args`, e.g. `["catalog", "list"]` or `["airflow", "create"]`
- If `args` is not in the command, then load commands from plugins
- `super().main()`
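The control flow described above can be modelled roughly as follows. This is a toy stand-in written without click, not the code in this PR: the class `LazyCommandCollection`, its `lazy_groups` loaders and the `loaded` list are all hypothetical names, and commands are plain callables instead of `click.Command` objects. Using `setdefault` so that plugins cannot replace core commands is an assumption that mirrors the "only load plugins if the command does not exist" premise.

```python
from typing import Callable, Dict, List

Command = Callable[[List[str]], str]
Loader = Callable[[], Dict[str, Command]]


class LazyCommandCollection:
    """Toy model of a command collection that loads plugin commands on demand."""

    def __init__(self, core: Dict[str, Command], lazy_groups: Dict[str, Loader]):
        self.commands = dict(core)            # core commands, available immediately
        self.lazy_groups = dict(lazy_groups)  # entry point name -> loader, not yet called
        self.loaded: List[str] = []           # entry point names loaded so far

    def _load_plugins(self) -> None:
        for name, loader in list(self.lazy_groups.items()):
            for cmd_name, cmd in loader().items():
                # setdefault keeps existing (core) commands untouched.
                self.commands.setdefault(cmd_name, cmd)
            self.loaded.append(name)
            del self.lazy_groups[name]

    def main(self, args: List[str]) -> str:
        # Only pay the plugin import cost when the command is not already known.
        if args and args[0] not in self.commands:
            self._load_plugins()
        if not args or args[0] not in self.commands:
            return "No such command: " + (args[0] if args else "")
        return self.commands[args[0]](args[1:])
```

For example, `main(["catalog", "list"])` is served from the core commands and leaves `loaded` empty, while `main(["airflow", "create"])` triggers the loaders first.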
Loading of plugins:
- I tried to implement a sort of "smart" loading where, if the command arg (i.e. `airflow`, `mlflow` or `viz`) partially matches the entry point names in the `lazy_group` dict keys, load the plugin, check if the command now exists and exit. Note: the entry point names are decided by the plugins, e.g. `kedro-airflow`'s entry point name is `airflow`, which fully matches the command name too, but for `kedro-viz` and `kedro-mlflow` the entry point names don't match the commands, e.g. command: `viz`, entry point name: `kedro-viz`.
- Otherwise, load all plugins one by one, exit if the command exists after loading plugins.
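A minimal sketch of that loading order, under stated assumptions: "partially matches" is interpreted here as a substring test of the command against the entry point name, and the function `load_until_found` and its loader callables are hypothetical, not the PR's actual code. Each loader returns the set of command names its plugin provides.

```python
from typing import Callable, Dict, List, Set


def load_until_found(
    command: str,
    lazy_groups: Dict[str, Callable[[], Set[str]]],
    commands: Set[str],
) -> List[str]:
    """Load plugins until `command` is available; return the entry points loaded, in order."""
    loaded: List[str] = []
    if command in commands:
        return loaded
    # Entry points whose name contains the command are tried first ...
    matches = [name for name in lazy_groups if command in name]
    # ... then every remaining plugin, one by one.
    rest = [name for name in lazy_groups if name not in matches]
    for name in matches + rest:
        commands.update(lazy_groups[name]())
        loaded.append(name)
        if command in commands:
            break  # stop as soon as the command exists
    return loaded
```

With entry points `kedro-mlflow`, `airflow` and `kedro-viz`, the command `viz` loads only `kedro-viz`, whereas a typo such as `infoo` matches nothing and falls through to loading every plugin, which is the slow path discussed in the comments.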
TODO
Developer Certificate of Origin
We need all contributions to comply with the Developer Certificate of Origin (DCO). All commits must be signed off by including a `Signed-off-by` line in the commit message. See our wiki for guidance.
If your PR is blocked due to unsigned commits, then you must follow the instructions under "Rebase the branch" on the GitHub Checks page for your PR. This will retroactively add the sign-off to all unsigned commits and allow the DCO check to pass.
Checklist
`RELEASE.md` file