Galileo-Galilei / kedro-pandera

A kedro plugin to use pandera in your kedro projects
https://kedro-pandera.readthedocs.io/en/latest/
Apache License 2.0

`kedro pandera coverage` #34

Open datajoely opened 11 months ago

datajoely commented 11 months ago

Description

The more I think about the importance of data contracts, the more ensuring coverage checks as part of a team's workflow feels like a natural evolution of this pattern.

Context

The way I see this, there are two standards a user should aim for:

Possible Implementation

Possible Alternatives

Galileo-Galilei commented 11 months ago

I like the idea, but why would we need AST introspection? Couldn't we just check whether all the datasets in a pipeline have a schema attached in their metadata? Is it related to the decorator-based way of triggering data checks?
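To make the metadata-based check concrete, here is a minimal sketch of how such a coverage figure could be computed. It works on plain dictionaries rather than kedro objects. The function name `schema_coverage` and the `metadata.pandera.schema` key layout are assumptions made for illustration, not a confirmed part of the plugin's API.

```python
from typing import Any, Iterable, List, Mapping, Tuple


def _has_schema(entry: Mapping[str, Any]) -> bool:
    """True if a catalog entry carries a schema under metadata.pandera.schema (assumed layout)."""
    metadata = entry.get("metadata") or {}
    pandera_meta = metadata.get("pandera") or {}
    return bool(pandera_meta.get("schema"))


def schema_coverage(
    catalog_config: Mapping[str, Mapping[str, Any]],
    dataset_names: Iterable[str],
) -> Tuple[float, List[str]]:
    """Return (fraction of datasets with a schema, sorted names lacking one).

    Hypothetical helper: dataset_names would be the datasets used by a pipeline,
    catalog_config the parsed catalog as nested dicts.
    """
    names = sorted(set(dataset_names))
    missing = [n for n in names if not _has_schema(catalog_config.get(n, {}))]
    if not names:
        return 1.0, []  # nothing to cover
    return (len(names) - len(missing)) / len(names), missing


# Illustrative catalog: only "iris" has a schema attached.
catalog = {
    "iris": {"type": "pandas.CSVDataset", "metadata": {"pandera": {"schema": "iris_schema"}}},
    "features": {"type": "pandas.ParquetDataset"},
}
ratio, missing = schema_coverage(catalog, ["iris", "features", "model_input"])
print(f"coverage: {ratio:.0%}, missing: {missing}")
```

Datasets that appear in the pipeline but not in the catalog (in-memory datasets) are counted as uncovered here; whether they should be is a design choice the thread leaves open.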

datajoely commented 11 months ago

So I was thinking about supporting the Python annotations (as well as the catalog metadata), but we don't actually have to do that using static analysis; we can just inspect the live objects at pipeline creation time.
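As a rough illustration of inspecting live objects instead of parsing source, the sketch below reads a node function's type hints at runtime. `SchemaMarker` is a hypothetical stand-in for whatever schema base class would be used in practice, and `params_with_schema` is an illustrative name, not an existing function.

```python
import inspect
from typing import Callable, Dict, get_type_hints


class SchemaMarker:
    """Hypothetical stand-in for a schema base class."""


class IrisSchema(SchemaMarker):
    """Example schema type used only for this sketch."""


def preprocess(iris: IrisSchema, params: dict) -> IrisSchema:
    """Example node function with an annotated input."""
    return iris


def params_with_schema(func: Callable) -> Dict[str, bool]:
    """Map each parameter name to whether its annotation is a SchemaMarker subclass."""
    hints = get_type_hints(func)  # resolved from the live function object, no AST needed
    result = {}
    for name in inspect.signature(func).parameters:
        hint = hints.get(name)
        result[name] = isinstance(hint, type) and issubclass(hint, SchemaMarker)
    return result


print(params_with_schema(preprocess))  # {'iris': True, 'params': False}
```

The same loop could be run over every node's function once the pipeline objects exist, which is why static analysis is not strictly required.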