kedro-org / kedro

Kedro is a toolbox for production-ready data science. It uses software engineering best practices to help you create data engineering and data science pipelines that are reproducible, maintainable, and modular.
https://kedro.org
Apache License 2.0
9.83k stars 895 forks

Improve documentation about `%load_ext kedro.ipython` to make it more visible #2777

Closed astrojuanlu closed 6 months ago

astrojuanlu commented 1 year ago

Description

We should make %load_ext kedro.ipython more visible, since it's tool-independent (as long as the tool is IPython-compatible) and also independent of how the service was launched (think for instance Databricks notebooks, Google Colab, and VSCode).
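For illustration, a minimal sketch of what this looks like outside a notebook cell. In a cell you would just type `%load_ext kedro.ipython`; the equivalent call from plain Python goes through IPython's `run_line_magic`. The helper name `load_kedro_extension` is hypothetical, and the sketch assumes Kedro is installed in whatever environment the IPython session runs in.

```python
def load_kedro_extension() -> bool:
    """Load the Kedro IPython extension if we are inside an IPython session.

    Returns True if the extension was requested, False if there is no
    IPython session to load it into (hypothetical helper, not part of Kedro).
    """
    try:
        from IPython import get_ipython
    except ImportError:
        # IPython is not installed, so there is nothing to extend.
        return False

    shell = get_ipython()
    if shell is None:
        # Plain `python` interpreter: no interactive shell is active.
        return False

    # Same effect as typing `%load_ext kedro.ipython` in a cell.
    shell.run_line_magic("load_ext", "kedro.ipython")
    return True


loaded = load_kedro_extension()
print("Kedro extension requested" if loaded else "Not running inside IPython")
```

Once the extension is loaded inside a Kedro project, the Kedro docs describe the `catalog`, `context`, `pipelines` and `session` variables as being available in the session, regardless of which tool started it.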

Context

A user wanted to use the Kedro extension on VSCode and didn't find a way in the docs https://www.linen.dev/s/kedro/t/13090672/hi-everyone-is-there-some-setting-that-could-allow-for-kedro#30d9d3e7-73e3-4719-8be5-882a947088e4 cc @m-gris

even if it's there already https://docs.kedro.org/en/stable/notebooks_and_ipython/kedro_and_notebooks.html#managed-services.

In fact, I don't even use kedro jupyter notebook that much 🤔 I always resort to %load_ext kedro.ipython everywhere.

Possible Implementation

We should probably restructure those pages a bit, so that it's clear that the command can be used in many places, and not just "managed services".

Possible Alternatives

I didn't think of any.

astrojuanlu commented 1 year ago

I almost opened this issue again 🙃

Another user got confused because they couldn't really use kedro jupyter notebook/lab in their environment, and %load_ext kedro.ipython worked perfectly, but I had to show them the solution:

https://linen-slack.kedro.org/t/15706050/hi-everyone-i-am-getting-acquainted-with-kedro-and-i-want-to#51630c54-7080-4cc7-bc5a-4cdba1d1cda9

astrojuanlu commented 10 months ago

For context, kedro jupyter notebook creates a kernel and launches Jupyter with it

https://github.com/kedro-org/kedro/blob/c3c93cb4786b4fd38c62baa2ec7f6fc73a63791e/kedro/framework/cli/jupyter.py#L90-L91
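To make "creates a kernel" more concrete, here is a rough sketch of a Jupyter kernel spec that loads the extension at startup. It is an assumption about what such a spec could look like, not a copy of the linked Kedro code: the function name `write_kernel_spec`, the display name and the temporary directory are all made up for illustration, and nothing is installed into Jupyter.

```python
import json
import sys
import tempfile
from pathlib import Path


def write_kernel_spec(target_dir: Path, display_name: str) -> Path:
    """Write a kernel.json whose launch command loads kedro.ipython.

    Hypothetical helper: it only writes the file, it does not register
    the kernel with Jupyter.
    """
    spec = {
        "argv": [
            sys.executable,
            "-m",
            "ipykernel_launcher",
            "-f",
            "{connection_file}",  # placeholder filled in by Jupyter at launch
            "--ext",
            "kedro.ipython",  # load the extension when the kernel starts
        ],
        "display_name": display_name,
        "language": "python",
    }
    target_dir.mkdir(parents=True, exist_ok=True)
    kernel_json = target_dir / "kernel.json"
    kernel_json.write_text(json.dumps(spec, indent=2), encoding="utf-8")
    return kernel_json


# Write the spec to a throwaway directory and read it back.
demo_dir = Path(tempfile.mkdtemp()) / "kedro_demo"
kernel_json_path = write_kernel_spec(demo_dir, "Kedro (demo)")
loaded_spec = json.loads(kernel_json_path.read_text(encoding="utf-8"))
print(loaded_spec["argv"])
```

The point of the sketch is that a dedicated kernel is essentially a saved launch command with the extension preloaded, which is why %load_ext kedro.ipython gives the same result from any kernel that has Kedro installed.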

astrojuanlu commented 10 months ago

@noklam mentions that kedro jupyter setup works so-so on Databricks, Amazon Sagemaker and so on.

stichbury commented 10 months ago

kedro jupyter notebook is clearly a Kedro command in that it's formatted in a similar way to all our others

%load_ext kedro.ipython is a weird-looking thing that isn't easy to parse, remember or understand. At first sight I'm not sure how to document it, and I think we instead need to document the problem that it solves. As in "How do I do X?" and the answer "You do it with load_ext kedro.ipython" and be really clear which section of the docs it goes into.

I'm happy to do this but need more of a steer on (a) the problem and (b) the location in the docs where we include it.

astrojuanlu commented 10 months ago

> As in "How do I do X?" and the answer "You do it with load_ext kedro.ipython" and be really clear which section of the docs it goes into.

💯 %

%load_ext kedro.ipython, kedro jupyter notebook and kedro jupyter init all solve the same problem: allowing users to quickly load a Kedro project from a Python interactive interpreter. The first one is an IPython command (https://ipython.readthedocs.io/en/stable/interactive/magics.html#magic-load_ext); the other two are, as you say, Kedro commands.

The problem with the latter two is that they're too rigid:

So I think %load_ext kedro.ipython should be at the very top of https://docs.kedro.org/en/stable/notebooks_and_ipython/kedro_and_notebooks.html, and then alternative workflows should be explained as a way to "not even have to %load_ext".

stichbury commented 7 months ago

See also comment in this thread: https://linen-slack.kedro.org/t/16320811/i-just-learned-that-the-command-kedro-jupyter-notebook-can-b#7a074598-5ddb-457f-ad73-d4a949158725

astrojuanlu commented 7 months ago

Another user whose problems with kedro jupyter were solved by using %load_ext kedro.ipython https://linen-slack.kedro.org/t/16331010/hi-sometimes-i-have-trouble-running-kedro-jupyter-notebook-i#50ac0562-8330-4697-982e-2892e2127c47

noklam commented 7 months ago

I almost duplicated this issue. I agree with everything said here. The Kedro-specific kernel is a nice idea in theory, but in practice it suffers from several issues:

  1. Users don't select the correct kernel, clicking the wrong one in a Jupyter/VSCode notebook
  2. Virtual environment / Jupyter kernel confusion - in the end the difference is subtle and people just assume they are the same. When they're not, it surprises people in a bad way.
  3. Cross-platform support for kernels seems to be hard - #2412
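
A quick way to check for the first two issues is to ask the running kernel which interpreter it is actually using. This is only a diagnostic sketch using the standard library; the function name `describe_interpreter` is made up for illustration.

```python
import importlib.util
import sys


def describe_interpreter() -> dict:
    """Summarise which Python the current kernel or shell is running on."""
    return {
        # Path of the interpreter behind this kernel.
        "executable": sys.executable,
        # Root of the active environment.
        "prefix": sys.prefix,
        # True when running inside a venv-style virtual environment.
        "in_virtualenv": sys.prefix != sys.base_prefix,
        # Whether Kedro can be imported from this environment at all.
        "kedro_importable": importlib.util.find_spec("kedro") is not None,
    }


info = describe_interpreter()
for key, value in info.items():
    print(f"{key}: {value}")
```

If `executable` points somewhere other than the environment where the project's dependencies were installed, or `kedro_importable` is False, the notebook is on the wrong kernel and %load_ext kedro.ipython will fail there too.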

astrojuanlu commented 7 months ago

Another user who could have benefitted from this https://linen-slack.kedro.org/t/16320811/i-just-learned-that-the-command-kedro-jupyter-notebook-can-b#435be7c7-e8cd-4791-b6ca-37637fd8b194

astrojuanlu commented 6 months ago

yay ❤️