Open WaylonWalker opened 3 years ago
@WaylonWalker do you know of a way to load a specific dataset from the catalog with the current version of Kedro (0.17.0), as you propose?
I just added a very basic setup.py and an __init__.py with a catalog loader to make it work.
@JavierHernandezMontes is your data local or remote? For local data that needs to be packaged in, it might be a bit trickier.
@WaylonWalker Is this still relevant with the IPython extension?
At the moment, using Kedro without the project template is entirely possible: run pip install kedro and then instantiate the catalog directly:
from kedro.config import OmegaConfigLoader
from kedro.io import DataCatalog
conf_loader = OmegaConfigLoader("conf")
conf_catalog = conf_loader.get("catalog")
catalog = DataCatalog.from_config(conf_catalog)
catalog.load(...)
On IPython & Jupyter, this is one line:
%load_ext kedro.ipython
catalog.load(...)
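Outside IPython, the boilerplate above could be wrapped in a small helper. This is only a minimal sketch: `load_catalog` is a hypothetical name and not part of Kedro, and it assumes a Kedro version that ships `OmegaConfigLoader`. Depending on the Kedro version and the loader arguments, configuration may be read from environment subfolders such as conf/base rather than from conf itself.

```python
from pathlib import Path
from typing import Any, Union


def load_catalog(conf_source: Union[str, Path] = "conf") -> Any:
    """Build a Kedro DataCatalog from the catalog config under ``conf_source``.

    Hypothetical helper for illustration; it is not part of Kedro.
    """
    conf_path = Path(conf_source)
    # Fail early with a clear message if the configuration folder is missing.
    if not conf_path.is_dir():
        raise FileNotFoundError(f"Configuration folder not found: {conf_path}")

    # Imported lazily so that importing this module does not require Kedro.
    from kedro.config import OmegaConfigLoader
    from kedro.io import DataCatalog

    conf_loader = OmegaConfigLoader(str(conf_path))
    # Same steps as the snippet above: read the "catalog" config, then build the catalog.
    return DataCatalog.from_config(conf_loader["catalog"])
```

With this in place, usage would be along the lines of `catalog = load_catalog("conf")` followed by `catalog.load("my_dataset")`, where `my_dataset` is a placeholder for a name defined in the project's catalog.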
I agree it would be nice to make the boilerplate above go away, so I'm moving this issue and renaming it for our consideration. It will have more visibility on the framework repo.
We are exploring alternative approaches, see gh-2819
And #2967
More evidence of users using the Kedro catalog without the pipelines: https://github.com/kedro-org/kedro/issues/2898#issuecomment-1736000094
Another user asking for exactly this https://linen-slack.kedro.org/t/16593946/is-there-a-way-of-installing-only-the-data-catalog-part-of-k#b6d532c4-2d7f-4add-b0ee-b0bfcffbdd5e
Would it make sense to make mini-kedro installable? My use case for projects like that is users who are doing EDA and just want easy access to the data with no fuss. If this makes sense, I propose adding a setup.py to make it installable, and a single module that sets the catalog up for them, so they can access the project's data as follows.
Alternatively
This could also be a separate kedro-catalog starter.