kedro-org / kedro

Kedro is a toolbox for production-ready data science. It uses software engineering best practices to help you create data engineering and data science pipelines that are reproducible, maintainable, and modular.
https://kedro.org
Apache License 2.0

Support for Cyclic Dependencies in Kedro Pipelines for Reinforcement Learning Scenarios #3815

Closed Sino-Huang closed 2 months ago

Sino-Huang commented 2 months ago

Description

I'm currently facing challenges with the Kedro pipeline structure, specifically its limitation to Directed Acyclic Graphs (DAGs). In reinforcement learning applications, the ability to create cyclic loops within the pipeline is crucial. For instance, a learning policy generates data that is then used to further train and refine the same policy. The current DAG structure does not support these types of cyclic dependencies, which is limiting for projects that involve iterative data generation and processing loops.

Context

The addition of support for cyclic dependencies is important because it would allow for more flexible pipeline configurations, especially beneficial in the context of AI and machine learning projects where iterative feedback loops are common. This feature would not only benefit my projects but also broaden Kedro's applicability in advanced machine learning scenarios, promoting its adoption and enhancing its utility.

Possible Implementation

One way to implement this could be by allowing users to define nodes or sub-pipelines that can conditionally loop back to earlier stages based on runtime data or conditions. Alternatively, a loop counter could cap the number of iterations.
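
To make the two ideas concrete, here is a minimal sketch in plain Python of the loop semantics I have in mind: repeat a step until a runtime condition holds or a counter runs out. It does not use any Kedro API; `run_loop`, `improve_policy`, `should_stop`, and `max_iterations` are hypothetical names I made up for illustration, and the "policy" is just a toy score.

```python
from typing import Callable, Tuple, TypeVar

State = TypeVar("State")


def run_loop(
    step: Callable[[State], State],
    state: State,
    should_stop: Callable[[State], bool],
    max_iterations: int,
) -> Tuple[State, int]:
    """Apply `step` repeatedly until `should_stop(state)` is true
    or `max_iterations` steps have run, whichever comes first.

    Hypothetical helper, not part of Kedro.
    """
    if max_iterations < 0:
        raise ValueError("max_iterations must be non-negative")

    iterations = 0
    # The counter bounds the loop; the predicate is the runtime condition.
    while iterations < max_iterations and not should_stop(state):
        state = step(state)
        iterations += 1
    return state, iterations


def improve_policy(score: float) -> float:
    """Toy stand-in for 'generate data with the policy, then retrain it':
    each round closes a quarter of the gap to a perfect score of 1.0."""
    return score + 0.25 * (1.0 - score)


# Stop once the score reaches 0.9, but never run more than 20 rounds.
final_score, rounds = run_loop(
    step=improve_policy,
    state=0.0,
    should_stop=lambda score: score >= 0.9,
    max_iterations=20,
)
```

In a pipeline setting, `step` would correspond to the node or sub-pipeline that loops back, and the returned iteration count is the loop counter mentioned above.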