daje0601 / Google_SCoRe

Paper reproduction of Google SCoRe (Training Language Models to Self-Correct via Reinforcement Learning)
Apache License 2.0
111 stars 16 forks