argilla-io / distilabel

⚗️ distilabel is a framework for synthetic data and AI feedback for AI engineers that require high-quality outputs, full data ownership, and overall efficiency.
https://distilabel.argilla.io
Apache License 2.0
1.12k stars 70 forks source link

[FEATURE] Allow `FormatTextGenerationSFT` to include tools/function calls in the formatted messages. #737

Open plaguss opened 2 weeks ago

plaguss commented 2 weeks ago

Is your feature request related to a problem? Please describe. With the new StructuredGeneration task we can generate datasets for function calling, and we could simplify preparing the dataset for training.

Describe the solution you'd like Extend FormatTextGenerationSFT (FormatChatGenerationSFT too?) to include the available tools and the function calls

Describe alternatives you've considered Let the user do it on its own.

Additional context Use the format from Mistral function calling.