Closed Hamana0509 closed 1 week ago
@frankaging I have re-implemented ORPOTrainer along the lines of the DPOTrainer example. Here is my code: https://colab.research.google.com/drive/1nKikg1c1-J5jGvlrqS995hNo6WqKmFXw?usp=sharing
@Hamana0509 Thanks for sharing your notebook and raising your question!
I think the problem is that the current model loading does not work well when you try to load a LoRA+ReFT model. To work around this, you have to load the weights manually: create a randomly initialized LoRA+ReFT model with the same configuration, then load the saved weights back into it.
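As a rough illustration of that pattern, here is a minimal sketch in plain PyTorch. It does not use the real pyreft or peft APIs: `ToyLowRankAdapter`, `build_adapters`, and the `adapters.pt` file name are hypothetical stand-ins, and the only assumption is that the saved weights can be read back as an ordinary `state_dict`. With the actual libraries, the rebuild step would be whatever code created the LoRA+ReFT model before training.

```python
import os
import tempfile

import torch
import torch.nn as nn


class ToyLowRankAdapter(nn.Module):
    """Hypothetical stand-in for a LoRA or ReFT module (not the pyreft/peft API)."""

    def __init__(self, dim: int, rank: int):
        super().__init__()
        self.down = nn.Linear(dim, rank, bias=False)
        self.up = nn.Linear(rank, dim, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Residual low-rank update applied to the hidden representation.
        return x + self.up(self.down(x))


def build_adapters(dim: int = 16, rank: int = 4) -> nn.ModuleDict:
    # Must be called with the same sizes as at training time,
    # otherwise the saved tensors will not fit the new modules.
    return nn.ModuleDict({
        "lora": ToyLowRankAdapter(dim, rank),
        "reft": ToyLowRankAdapter(dim, rank),
    })


# Training side: save only the adapter weights.
trained = build_adapters()
ckpt_path = os.path.join(tempfile.mkdtemp(), "adapters.pt")
torch.save(trained.state_dict(), ckpt_path)

# Inference side: build a randomly initialized copy, then load the weights back.
restored = build_adapters()
state = torch.load(ckpt_path, map_location="cpu")
result = restored.load_state_dict(state, strict=True)

# With strict=True a key or shape mismatch raises an error instead of being
# silently ignored, which makes configuration drift easy to spot.
print("missing:", result.missing_keys, "unexpected:", result.unexpected_keys)
```

Using `strict=True` is a deliberate choice here: if the rebuilt model's configuration differs from the one used in training, the load fails loudly rather than leaving some modules at their random initialization.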
As for your trainer, feel free to open a PR to submit it! It would be a great contribution. Thanks!
@frankaging thank you
I trained and saved ReFT and LoRA modules for the Llama3-8B-Instruct model, but when I load them from HuggingFace for inference, I get the following error:
What does the error above mean, and how can I fix it?