liziniu / ReMax

Code for Paper (ReMax: A Simple, Efficient and Effective Reinforcement Learning Method for Aligning Large Language Models)

Bugs when using zero-stage3 #1

Closed George-Chia closed 12 months ago

George-Chia commented 12 months ago

Great work! I hit the following error when running your code with REFERENCE_ZERO_STAGE=3:

AttributeError: 'LlamaAttention' object has no attribute 'rope_theta'

I believe this is inherited from DS-Chat (DeepSpeed-Chat), but how did you fix it?
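For anyone hitting the same error, here is a possible stopgap, not the fix used in this repo. It is a minimal sketch that assumes the error comes from LlamaAttention instances that were built without a rope_theta attribute (for example, because of a mismatch between the installed transformers version and the DS-Chat code), and that 10000.0, the default rope_theta in LlamaConfig, is the right value for the checkpoint. The helper name patch_missing_rope_theta is hypothetical.

```python
def patch_missing_rope_theta(modules, default=10000.0):
    """Set `rope_theta` on LlamaAttention modules that lack it.

    `modules` is any iterable of module objects. The value is taken from
    `module.config.rope_theta` when available, otherwise from `default`
    (assumption: 10000.0, the LlamaConfig default). Returns the number of
    modules that were patched.
    """
    patched = 0
    for module in modules:
        is_llama_attention = type(module).__name__ == "LlamaAttention"
        if is_llama_attention and not hasattr(module, "rope_theta"):
            config = getattr(module, "config", None)
            module.rope_theta = getattr(config, "rope_theta", default)
            patched += 1
    return patched
```

With a PyTorch model this could be called as patch_missing_rope_theta(model.modules()) after the reference model is created and before the first forward pass. If the root cause is a version mismatch, aligning the transformers version with the one the code was written for is likely the cleaner solution.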