Dahoas / QDSyntheticData

Teaching LLMs to reason with RL #277

Closed by Dahoas 3 months ago

Dahoas commented 3 months ago

closes #235