Open francisduval opened 9 months ago
Backlog grooming notes:
This was also highlighted in #1750, and would build on the dataset preview and debugging work stream. We should consider implementing this. Next step: investigate technical feasibility.
Description
When running `kedro viz run`, there is no way to know which datasets are up to date and which are outdated. A dataset is outdated if the code upstream of it has changed since the dataset was last generated. This feature exists in the `targets` package in R. In addition, when you run a `targets` pipeline, only the outdated nodes are run, which saves computing time.
Context
This would be a useful feature because, without it, there is no effective way to tell which parts of the pipeline need to be rerun after the code has changed. When you are unsure whether a dataset is up to date, the only way to be sure is to rerun it, which can take a long time.
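To make the notion of "outdated" concrete, here is a minimal sketch in plain Python. It does not use any Kedro API: the `Node` class, `fingerprint`, and `find_outdated` are hypothetical names made up for illustration, and hashing the node's source text is just one assumed way to detect that code has changed. A node counts as stale if its own code changed since the last recorded run, or if any of its inputs is itself outdated.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """Hypothetical stand-in for a pipeline node (not a Kedro class)."""
    name: str
    source: str      # source code of the node function, as text
    inputs: tuple    # names of datasets the node reads
    outputs: tuple   # names of datasets the node writes


def fingerprint(node):
    """Hash the node's source text so that code changes can be detected."""
    return hashlib.sha256(node.source.encode("utf-8")).hexdigest()


def find_outdated(nodes, recorded):
    """Return (stale node names, outdated dataset names).

    `nodes` must be in topological order. `recorded` maps a node name to
    the fingerprint stored after that node's last successful run.
    """
    stale_nodes = []
    outdated_datasets = set()
    for node in nodes:
        code_changed = recorded.get(node.name) != fingerprint(node)
        upstream_outdated = any(d in outdated_datasets for d in node.inputs)
        if code_changed or upstream_outdated:
            stale_nodes.append(node.name)
            outdated_datasets.update(node.outputs)
    return stale_nodes, outdated_datasets


# A three-node pipeline; fingerprints are recorded after a full run.
clean = Node("clean", "def clean(raw): return raw.dropna()", ("raw",), ("cleaned",))
features = Node("features", "def features(c): return c * 2", ("cleaned",), ("feats",))
report = Node("report", "def report(raw): return len(raw)", ("raw",), ("summary",))
recorded = {n.name: fingerprint(n) for n in (clean, features, report)}

# The code of `clean` is then edited; the other nodes are untouched.
edited_clean = Node("clean", "def clean(raw): return raw.fillna(0)", ("raw",), ("cleaned",))
stale_nodes, outdated_datasets = find_outdated([edited_clean, features, report], recorded)

print(stale_nodes)                # ['clean', 'features']
print(sorted(outdated_datasets))  # ['cleaned', 'feats']
```

In this sketch `summary` stays up to date because `report` only depends on `raw`, so a run restricted to `stale_nodes` would skip it. A real implementation would also need to decide where fingerprints are stored and whether changes to parameters or input data count as making a dataset outdated.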
Possible Implementation
Color datasets that are outdated with another color. Also, it would be nice to have a `kedro` command that would only run outdated datasets, such as `kedro run --only_outdated` or `kedro run --pipeline pipeline_name --only_outdated`.
Checklist