kedro-org / kedro

Kedro is a toolbox for production-ready data science. It uses software engineering best practices to help you create data engineering and data science pipelines that are reproducible, maintainable, and modular.
https://kedro.org
Apache License 2.0
9.84k stars 894 forks

Parent task: Content on Kedro + complementary tools (integrations with other tools, best practices and tutorials) #2817

Open stichbury opened 1 year ago

stichbury commented 1 year ago

Parent-level task covering a range of integration content, to include some or all of the following:

- Great Expectations
- Pandera
- Weights & Biases
- DVC
- MLflow
- Prefect
- Dagster
- ZenML
- DBT
- Intake

stichbury commented 1 year ago

@astrojuanlu This is part 3 of the process described in #3012 -- it's a parent task for any writing we do about complementary tools and how to integrate with Kedro. I'll need your guidance on which to prioritize, which to omit and which to add.

astrojuanlu commented 1 year ago

Complements: Let's prioritize pandera (kedro-pandera) and MLflow (kedro-mlflow). Prefect already has some docs around it.

Competitors: Let's prioritize dbt, MLflow (yes, it's a competitor and a complement), DVC.

The rest (GE, W&B, Dagster, ZenML, Hamilton, Intake) can come later.