⚗️ distilabel is a framework for synthetic data and AI feedback for AI engineers that require high-quality outputs, full data ownership, and overall efficiency.
This PR adds a new attribute to requirements to BasePipeline to keep track of the dependencies needed to run a Pipeline. The pipeline.dump() method now contains a new key with the requirements, if any.
We can include requirements at the Pipeline level, and ideally we would add requirements for custom steps via
@requirements decorator, to avoid making the step definition more verbose.
It will throw a ValueError before running and show the dependencies that aren't already installed in your environment.
Description
This PR adds a new attribute to
requirements
toBasePipeline
to keep track of the dependencies needed to run aPipeline
. Thepipeline.dump()
method now contains a new key with the requirements, if any.We can include requirements at the
Pipeline
level, and ideally we would add requirements for custom steps via@requirements
decorator, to avoid making the step definition more verbose.It will throw a
ValueError
before running and show the dependencies that aren't already installed in your environment.