argilla-io / distilabel

Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verified research papers.
https://distilabel.argilla.io
Apache License 2.0
1.45k stars 111 forks

[FEATURE] Add `RayPipeline` class #717

Closed by gabrielmbmb 2 months ago

gabrielmbmb commented 3 months ago

Description

Add a `RayPipeline` class that executes a distilabel pipeline on a Ray cluster. This will allow distilabel pipelines to scale beyond a single machine.
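
To make the idea concrete, here is a minimal sketch of the design, not the actual distilabel or Ray API. `ToyPipeline`, `ToyRayPipeline`, `add_greeting`, and `shout` are hypothetical names invented for illustration, and a standard-library thread pool stands in for the Ray cluster so the example runs without extra dependencies. The point is only that the distributed variant keeps the same interface and changes how batches are dispatched.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List

Row = Dict[str, str]
Step = Callable[[List[Row]], List[Row]]


class ToyPipeline:
    """Hypothetical stand-in for a pipeline: runs every step in-process."""

    def __init__(self, steps: List[Step], batch_size: int = 2) -> None:
        self.steps = steps
        self.batch_size = batch_size

    def _batches(self, rows: List[Row]) -> List[List[Row]]:
        # Split the rows into consecutive batches of at most batch_size.
        size = self.batch_size
        return [rows[i:i + size] for i in range(0, len(rows), size)]

    def _map(self, step: Step, batches: List[List[Row]]) -> List[List[Row]]:
        # Local execution: process one batch after another.
        return [step(batch) for batch in batches]

    def run(self, rows: List[Row]) -> List[Row]:
        for step in self.steps:
            results = self._map(step, self._batches(rows))
            rows = [row for batch in results for row in batch]
        return rows


class ToyRayPipeline(ToyPipeline):
    """Same interface, but batches are dispatched to a pool of workers.

    Assumption: a thread pool stands in for Ray here. In a real
    implementation this is where batches would be sent to remote
    workers on the cluster.
    """

    def __init__(self, steps: List[Step], batch_size: int = 2,
                 num_workers: int = 4) -> None:
        super().__init__(steps, batch_size)
        self.num_workers = num_workers

    def _map(self, step: Step, batches: List[List[Row]]) -> List[List[Row]]:
        with ThreadPoolExecutor(max_workers=self.num_workers) as pool:
            # Executor.map returns results in input order.
            return list(pool.map(step, batches))


def add_greeting(batch: List[Row]) -> List[Row]:
    return [{**row, "greeting": f"Hello, {row['name']}!"} for row in batch]


def shout(batch: List[Row]) -> List[Row]:
    return [{**row, "greeting": row["greeting"].upper()} for row in batch]


rows = [{"name": n} for n in ["Ada", "Grace", "Linus", "Guido", "Barbara"]]
local_out = ToyPipeline([add_greeting, shout]).run(rows)
distributed_out = ToyRayPipeline([add_greeting, shout]).run(rows)
```

Because only the dispatch method differs, both variants produce the same rows in the same order. That suggests switching an existing pipeline to a cluster could ideally be limited to changing the pipeline class.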