chufanchen / read-paper-and-code

CoRR 2023 | ReMax: A Simple, Effective, and Efficient Reinforcement Learning Method for Aligning Large Language Models #59

Open chufanchen opened 7 months ago

chufanchen commented 7 months ago

https://arxiv.org/abs/2310.10505

https://github.com/liziniu/ReMax