kedro-org / kedro

Kedro is a toolbox for production-ready data science. It uses software engineering best practices to help you create data engineering and data science pipelines that are reproducible, maintainable, and modular.
https://kedro.org
Apache License 2.0

Add memory profiling for catalog #4264

Open noklam opened 1 month ago

noklam commented 1 month ago

Context

In the current setup, we use the same pipeline and catalog configuration, and we test memory usage by running nodes with a heavy compute load that should not increase memory. It may be worth adding other scenarios that do grow memory, so that we can compare runs and confirm that memory grows as expected.

_Originally posted by @ElenaKhaustova in https://github.com/kedro-org/kedro/pull/4210#discussion_r1819227669_

Description

Add memory profiling for catalogs.
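
As a rough illustration of the comparison described in the context, here is a minimal sketch using only the standard library's `tracemalloc`. It is not Kedro's benchmark setup: `compute_heavy_node`, `memory_growing_node`, and `peak_memory_bytes` are hypothetical names made up for this example, and no Kedro catalog or pipeline API is involved. Note that `tracemalloc` only traces allocations made through Python's memory allocators, so a real catalog benchmark may need a different tool.

```python
import tracemalloc


def compute_heavy_node(n):
    """Hypothetical node: CPU-bound work that keeps no large objects alive."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def memory_growing_node(n):
    """Hypothetical node: builds a list whose size grows with n."""
    return list(range(n))


def peak_memory_bytes(func, *args):
    """Run func(*args) and return the peak traced allocation in bytes."""
    tracemalloc.start()
    try:
        func(*args)
        # get_traced_memory() returns (current, peak) since tracing started.
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak


# Same input size for both scenarios so the peaks are comparable.
flat_peak = peak_memory_bytes(compute_heavy_node, 200_000)
growing_peak = peak_memory_bytes(memory_growing_node, 200_000)

print(f"compute-heavy peak:  {flat_peak} bytes")
print(f"memory-growing peak: {growing_peak} bytes")
```

The compute-heavy scenario should report a small peak, while the memory-growing one should report a peak that scales with the input size. That difference is the "memory grows as expected" signal the comparison is meant to show.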