hbin0701 / Self-Explore

[EMNLP Findings 2024 & ACL 2024 NLRSE Oral] Enhancing Mathematical Reasoning in Language Models with Fine-grained Rewards
https://arxiv.org/abs/2404.10346