argilla-io / distilabel

Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verified research papers.
https://distilabel.argilla.io
Apache License 2.0
1.45k stars 111 forks

[IMPLEMENTATION] Self-play with Execution Feedback: Improving Instruction-following Capabilities of Large Language Models #749

Open: gabrielmbmb opened this issue 3 months ago

gabrielmbmb commented 3 months ago

https://arxiv.org/abs/2406.13542