kedro-org / kedro-starters

Templates for your Kedro projects.
Apache License 2.0

`standalone-datacatalog` doesn't have Kedro metadata #115

Closed astrojuanlu closed 9 months ago

astrojuanlu commented 1 year ago

Description

As per the title: the starter does not ship a pyproject.toml with the Kedro metadata, so project-level subcommands such as kedro jupyter are not available:

https://github.com/kedro-org/kedro/blob/116ddd015e81d2a6930a0dfbe83e630e526634f4/kedro/framework/cli/cli.py#L172-L173
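For readers who want to check a directory by hand, here is a minimal sketch of that kind of detection. It assumes the metadata lives in a `[tool.kedro]` table inside pyproject.toml, as it does in full Kedro projects. The function name `has_kedro_metadata` is hypothetical and is not Kedro's API; the linked lines remain the reference for what the CLI actually does.

```python
from pathlib import Path
import tempfile


def has_kedro_metadata(project_path) -> bool:
    """Return True if project_path has a pyproject.toml containing a [tool.kedro] table.

    Hypothetical helper for illustration only; it is not part of Kedro.
    """
    pyproject = Path(project_path) / "pyproject.toml"
    if not pyproject.is_file():
        return False
    # A plain text search is enough for a quick check and needs no TOML parser.
    return "[tool.kedro]" in pyproject.read_text(encoding="utf-8")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # A bare directory, like the one this starter produces: no metadata.
        print(has_kedro_metadata(tmp))

        # Adding a pyproject.toml with a [tool.kedro] table changes the result.
        (Path(tmp) / "pyproject.toml").write_text(
            '[tool.kedro]\npackage_name = "demo"\n', encoding="utf-8"
        )
        print(has_kedro_metadata(tmp))
```

Running this against the directory created by kedro new -s standalone-datacatalog should report that no metadata is present, which matches the missing jupyter subcommand in the output below.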

Context

I was trying to use the standalone-datacatalog from Jupyter to have a minimal Kedro setup from a notebook, but found that the kedro.ipython extension was not loading.

Steps to Reproduce

  1. kedro new -s standalone-datacatalog
  2. Try kedro jupyter notebook
  3. See it fail because it's not found

Expected Result

kedro jupyter works for all starters.

Actual Result

juan_cano@M-PH9T4K3P3C /t/test-kedro-ipython-mini> kedro jupyter notebook                                          (kpolars310) 
Usage: kedro [OPTIONS] COMMAND [ARGS]...
Try 'kedro -h' for help.

Error: No such command 'jupyter'.
juan_cano@M-PH9T4K3P3C /t/test-kedro-ipython-mini [2]> kedro                                                       (kpolars310) 
Usage: kedro [OPTIONS] COMMAND [ARGS]...

  Kedro is a CLI for creating and using Kedro projects. For more information,
  type ``kedro info``.

Options:
  -V, --version  Show version and exit
  -h, --help     Show this message and exit.

Global commands from Kedro
Commands:
  docs     See the kedro API docs and introductory tutorial.
  info     Get more information about kedro.
  new      Create a new kedro project.
  starter  Commands for working with project starters.


noklam commented 1 year ago

@astrojuanlu Thanks for picking this up.

This is extracted from the README

Limitation ... If you stick with this project structure without transitioning to a full project, you will only be able to use Kedro’s library components such as the DataCatalog, Node and Pipeline manually. You won’t be able to use features available in a full Kedro project, including project-based CLI commands such as kedro run.

This includes kedro jupyter; we also suggest that users launch the notebook with jupyter notebook instead of the kedro command.

I do realize this is a bit awkward, but I don't think adding the pyproject.toml to make it a full project is the correct approach since we want to make it standalone instead of a full Python package.

astrojuanlu commented 1 year ago

Thanks @noklam for the extra context (well, and also for pasting the README I should have, errr, read).

Maybe we can work on this from a different angle. Given that the purpose of the kedro.ipython extension is to give easy access to catalog, context, pipelines and session, would it be possible for it to work without metadata? I guess the only missing bit would be the Kedro session.

Otherwise, let's just turn this issue into a docs issue so that we can adapt this sentence or add a warning around it:

https://github.com/kedro-org/kedro/blob/393d9d202f97bf07c8b1fc9cba6493979d6e958c/docs/source/notebooks_and_ipython/kedro_and_notebooks.md?plain=1#L9

noklam commented 1 year ago

@astrojuanlu

I think kedro.ipython also needs a full Kedro project to work with, but maybe there is a workaround.

I am going to give this a closer look while I work on the kedro jupyter-init command.

merelcht commented 1 year ago

Needs to be taken into account for the user journey where users incrementally onboard to Kedro.

noklam commented 1 year ago

Answering the question earlier:

noklam commented 9 months ago

We removed this starter, right?

astrojuanlu commented 9 months ago

Indeed, it got archived in #184.