Open astrojuanlu opened 2 months ago
@astrojuanlu interesting use case. Have you seen a lot that users define pipelines in notebooks or import them to there?
I thought vast majority of notebook usage is to do `catalog.load("something")` and then some EDA. While all pipeline definition is in `.py` files.
> Have you seen a lot that users define pipelines in notebooks
I have not, and probably the reason is that traditionally Kedro had taken sort of an anti-notebook stance. We evolved that in 2023, for example by writing https://docs.kedro.org/en/stable/notebooks_and_ipython/notebook-example/add_kedro_to_a_notebook.html
I've personally found it very handy to explain things to data scientists with notebooks when teaching. See for example https://github.com/ibis-project/kedro-ibis-tutorial/blob/main/03%20-%20First%20Steps%20with%20Kedro.ipynb, recording (very well received) or https://github.com/astrojuanlu/kedro-databricks-demo/blob/main/First%20Steps%20with%20Kedro%20on%20Databricks.ipynb (essentially the same thing, but with a `ManagedTableDataset` connecting to DBX UC). Being able to visualise the pipelines there directly would be awesome I think.
> or import them to there?
We launched a feature earlier this year to do something like that: https://docs.kedro.org/en/stable/notebooks_and_ipython/kedro_and_notebooks.html#load-node-line-magic. It's for nodes rather than full pipelines, though.
> I thought vast majority of notebook usage is to do `catalog.load("something")` and then some EDA.
That's our impression too, yes (and in fact I do that all the time). So this issue would be about taking that one little step further.
A user just asked about this.
(And it had nothing to do with notebooks)
Hello, I'm adding some context for my use case after sending a message on Slack. Kedro-Viz diagrams are very useful for non-technical people who want a high-level view of the data pipeline. While documenting models in my company's internal Notion, I thought including a Kedro-Viz diagram would be super useful, as well as generating a new one every time a change to the pipeline is released. I got the idea when I saw that Notion renders diagrams written in Mermaid, but I don't know and haven't checked whether Kedro-Viz is based on Mermaid under the hood.
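To make the Mermaid idea a bit more concrete, here is a rough sketch of what generating such a diagram could look like. It is not how Kedro-Viz works, and nothing in this thread confirms that Kedro-Viz uses Mermaid. `NodeSpec` and `to_mermaid` are hypothetical names made up for illustration. With a real Kedro `Pipeline`, the node names and dataset names could presumably be read from `pipeline.nodes` (each node's `name`, `inputs` and `outputs`), but that wiring is an assumption and is left out so the snippet runs without Kedro installed.

```python
from typing import Iterable, List, NamedTuple, Tuple


class NodeSpec(NamedTuple):
    """Hypothetical stand-in for a pipeline node: a name plus dataset names."""

    name: str
    inputs: Tuple[str, ...]
    outputs: Tuple[str, ...]


def to_mermaid(nodes: Iterable[NodeSpec]) -> str:
    """Render nodes and datasets as a left-to-right Mermaid flowchart.

    Assumes names contain no double quotes, since labels are not escaped.
    """
    lines: List[str] = ["flowchart LR"]
    dataset_ids = {}  # dataset name -> short Mermaid id (d0, d1, ...)

    def dataset_id(name: str) -> str:
        # Declare each dataset once, the first time it is seen,
        # using Mermaid's cylinder shape.
        if name not in dataset_ids:
            dataset_ids[name] = f"d{len(dataset_ids)}"
            lines.append(f'    {dataset_ids[name]}[("{name}")]')
        return dataset_ids[name]

    for index, spec in enumerate(nodes):
        node_id = f"n{index}"
        # Processing steps are drawn as plain rectangles.
        lines.append(f'    {node_id}["{spec.name}"]')
        for name in spec.inputs:
            source = dataset_id(name)
            lines.append(f"    {source} --> {node_id}")
        for name in spec.outputs:
            target = dataset_id(name)
            lines.append(f"    {node_id} --> {target}")

    return "\n".join(lines)


if __name__ == "__main__":
    example = [
        NodeSpec("clean", ("raw",), ("cleaned",)),
        NodeSpec("train", ("cleaned", "params"), ("model",)),
    ]
    print(to_mermaid(example))
```

The printed text can be pasted into a Mermaid code block in Notion. Regenerating it in CI on every release would cover the "new diagram on every change" part, though that step is not shown here.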
Originally #1459, extra context in https://github.com/kedro-org/kedro-viz/discussions/1833#discussioncomment-9949391 reproduced below:
I am showcasing Kedro concepts on a notebook without creating a full-fledged project. Took https://github.com/ibis-project/kedro-ibis-tutorial/blob/main/03%20-%20First%20Steps%20with%20Kedro.ipynb as inspiration, and adapted it to Spark and Databricks (will try to publish that soon).
However, since there is no Kedro Framework project, there is no way I can visualise my pipelines, even though I have a `Pipeline` object perfectly defined.

It would be insanely awesome if I could do `KedroViz().visualize(pipe).show()` or something like that, without ever needing to set up a Kedro project.