AkihikoWatanabe / paper_notes

Paper notes, added to occasionally
https://AkihikoWatanabe.github.io/paper_notes

Training Large Language Models for Reasoning through Reverse Curriculum Reinforcement Learning, Zhiheng Xi+, N/A, arXiv'24 #1392

Open: AkihikoWatanabe opened this issue 2 weeks ago

AkihikoWatanabe commented 2 weeks ago

URL