Open namedgraph opened 3 weeks ago
Hey @namedgraph, thank you for your feature proposal. Your idea makes sense, but as of now, Kedro does not support grouping artifacts in the manner you describe, and interprets each entry in the catalog as a separate data source with its own type definition.
For now, you can try to use Kedro dataset factories to reduce the number of similar catalog entries on your project.
@lrcouto it feels inconsistent that one can nest YAML in parameters and use the `parent:child` syntax, but not in the catalog 🤷♂️
Description
I tried grouping the artifacts by introducing "namespaces" as the first level of config in YAML while moving the actual artifacts to the second level:
and was planning to address the artifacts as `a_group_of_artifacts:outputs`, `a_group_of_artifacts:errors`, etc.
But it turns out that Kedro does not support this?
Context
Our pipelines mostly augment the initial inputs, which means we end up with a lot of similarly named artifacts (e.g. `final_outputs`, `processed_outputs` and other kinds of `_outputs`), which gets confusing. It feels that there should be a better way to group/namespace the artifacts.
Possible Implementation
Instead of treating the 1st-level YAML blocks as artifacts, why not traverse the levels recursively until a block with `type` is encountered, and treat that block as an artifact while ignoring the other nesting blocks?
Possible Alternatives
Maybe some other solution I don't know about? Not a Kedro expert...
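For what it's worth, the recursive traversal suggested under Possible Implementation could look roughly like the sketch below. It is plain Python run over an already-loaded config dict, not something Kedro does today. The function name `flatten_catalog`, the `:` separator, and the file paths are all assumptions made up for illustration.

```python
from typing import Any


def flatten_catalog(
    config: dict[str, Any], prefix: str = "", sep: str = ":"
) -> dict[str, dict[str, Any]]:
    """Flatten nested catalog blocks into 'group:name' entries.

    Hypothetical helper: recursion stops at the first block that
    contains a `type` key, which is treated as the artifact definition.
    """
    flat: dict[str, dict[str, Any]] = {}
    for name, block in config.items():
        full_name = f"{prefix}{sep}{name}" if prefix else name
        if not isinstance(block, dict):
            raise ValueError(f"'{full_name}' is neither a group nor an artifact")
        if "type" in block:
            # A block with `type` is an artifact; do not descend further.
            flat[full_name] = block
        else:
            # Any other block is just a grouping level; keep descending.
            flat.update(flatten_catalog(block, full_name, sep))
    return flat


# Hypothetical nested config, as it might look after loading the YAML.
nested = {
    "a_group_of_artifacts": {
        "outputs": {"type": "pandas.CSVDataset", "filepath": "data/outputs.csv"},
        "errors": {"type": "pandas.CSVDataset", "filepath": "data/errors.csv"},
    },
    "standalone": {"type": "pandas.CSVDataset", "filepath": "data/standalone.csv"},
}

flat = flatten_catalog(nested)
print(sorted(flat))
# ['a_group_of_artifacts:errors', 'a_group_of_artifacts:outputs', 'standalone']
```

The flattened dict has the same one-entry-per-artifact shape the catalog expects now, so top-level entries would keep working unchanged. Whether `:` is a safe separator inside Kedro is an open question.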