⚗️ distilabel is a framework for synthetic data and AI feedback for AI engineers that require high-quality outputs, full data ownership, and overall efficiency.
Is your feature request related to a problem? Please describe.
With the new StructuredGeneration task we can generate datasets for function calling, and we could simplify preparing the dataset for training.
Describe the solution you'd like
Extend FormatTextGenerationSFT (FormatChatGenerationSFT too?) to include the available tools and the function calls
Describe alternatives you've considered
Let the user do it on its own.
Is your feature request related to a problem? Please describe. With the new
StructuredGeneration
task we can generate datasets for function calling, and we could simplify preparing the dataset for training.Describe the solution you'd like Extend
FormatTextGenerationSFT
(FormatChatGenerationSFT
too?) to include the available tools and the function callsDescribe alternatives you've considered Let the user do it on its own.
Additional context Use the format from Mistral function calling.