kedro-org / kedro-viz

Visualise your Kedro data and machine-learning pipelines and track your experiments.
https://demo.kedro.org
Apache License 2.0

Add a note admonition about leveraging YAML anchors & aliases to avoid copypasting `metadata.kedro-viz.layer` in catalog #1956

Open yury-fedotov opened 1 week ago

yury-fedotov commented 1 week ago

Description

This section of docs provides a guide to adding layers to the visualization by defining them as follows:

companies:
  type: pandas.CSVDataset
  filepath: data/01_raw/companies.csv
  metadata:
    kedro-viz:
      layer: raw

It also gives the following example:

companies:
  type: pandas.CSVDataset
  filepath: data/01_raw/companies.csv
  metadata:
    kedro-viz:
      layer: raw

reviews:
  type: pandas.CSVDataset
  filepath: data/01_raw/reviews.csv
  metadata:
    kedro-viz:
      layer: raw

shuttles:
  type: pandas.ExcelDataset
  filepath: data/01_raw/shuttles.xlsx
  metadata:
    kedro-viz:
      layer: raw

...

Context

In my projects, I have found it very helpful to use a YAML anchor to store those 3 lines once per layer, like this:

_raw_layer: &raw_layer
  metadata:
    kedro-viz:
      layer: 01_raw

And then reuse it like this:

companies:
  type: pandas.CSVDataset
  filepath: data/01_raw/companies.csv
  <<: *raw_layer

reviews:
  type: pandas.CSVDataset
  filepath: data/01_raw/reviews.csv
  <<: *raw_layer

shuttles:
  type: pandas.ExcelDataset
  filepath: data/01_raw/shuttles.xlsx
  <<: *raw_layer
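
For readers unfamiliar with merge keys, here is a minimal sketch of how the alias behaves once the catalog is parsed. It assumes the YAML loader supports the YAML 1.1 merge key `<<` (PyYAML, for instance, does). The `cars` dataset and its `description` field are hypothetical and only illustrate that the merge is shallow: a dataset that declares its own `metadata` key replaces the anchored block rather than extending it.

```yaml
# Anchor definition, same as in the snippet above.
_raw_layer: &raw_layer
  metadata:
    kedro-viz:
      layer: 01_raw

# Alias + merge key: parsed as if the three metadata lines were written inline.
companies:
  type: pandas.CSVDataset
  filepath: data/01_raw/companies.csv
  <<: *raw_layer

# Hypothetical dataset: the merge is shallow, so this explicit `metadata`
# key wins and the anchored `kedro-viz.layer` value is NOT carried over.
cars:
  type: pandas.CSVDataset
  filepath: data/01_raw/cars.csv
  <<: *raw_layer
  metadata:
    description: illustrative field only
```

So the trick works best for datasets whose `metadata` block contains nothing but the layer; a dataset that needs extra metadata would have to spell out the layer again.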

Possible Implementation

What I propose is to add a small note admonition suggesting that YAML anchors & aliases can be a great fit here, to avoid copy-pasting those 3 lines if you have, e.g., 10 datasets defined in a layer.

By admonition I mean e.g. this:

[Image: Screenshot 2024-06-24 at 10 35 00 PM]

It can mention that this feature is not Kedro-specific at all and is provided by the YAML format itself. I still think it would be helpful, since the trick is highly reusable and can simplify large catalogs quite a lot for users unfamiliar with anchors & aliases in YAML.

I do not propose changing the existing example, which repeats those 3 lines 3 times. I think my suggestion fits better as a note admonition.


yury-fedotov commented 1 week ago

LMK if that's something you would want in the docs, I'm happy to open a PR if so.