Guitaricet / relora

Official code for ReLoRA from the paper Stack More Layers Differently: High-Rank Training Through Low-Rank Updates
https://arxiv.org/abs/2307.05695
Apache License 2.0

Once this model is trained, how should I infer? Is there an example script? #2

Closed by ScottishFold007 1 year ago

ScottishFold007 commented 1 year ago

Once this model is trained, how should I infer? Is there an example script? I'm having trouble loading the model using the following method: [screenshot: model-loading code]

[screenshot: missing-parameter warnings] It seems the loaded model is missing many parameters; can it still make good predictions?

Guitaricet commented 1 year ago

Hi! Thank you for providing more details about your question.

Assuming .merge_and_reinit() was applied before the model was saved, you can use this code:

from peft_pretraining.modeling_llama import LlamaForCausalLM
model = LlamaForCausalLM.from_pretrained("path/to/model/save/dir")

Wrapping the model with the ReLoRA wrapper is only needed for training.
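To clarify why no extra parameters are needed at inference time, here is a minimal plain-Python sketch (not the repo's actual implementation) of the merge step that `.merge_and_reinit()` performs: the low-rank update B @ A is folded into the dense weight W, so the saved checkpoint is an ordinary dense model with no LoRA parameters left to load.

```python
def matmul(X, Y):
    """Naive matrix multiply for small nested-list matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]

def merge(W, A, B, scale=1.0):
    """Return W + scale * (B @ A): the low-rank update folded into W."""
    BA = matmul(B, A)
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(W, BA)]

# Toy 2x2 weight with rank-1 factors B (2x1) and A (1x2).
W = [[1.0, 0.0], [0.0, 1.0]]
B = [[1.0], [2.0]]
A = [[3.0, 4.0]]
W_merged = merge(W, A, B)
# After merging, the forward pass uses W_merged alone; in ReLoRA the
# factors A and B are then re-initialized for the next low-rank cycle,
# and only the dense weights need to be saved and loaded.
```

This is why `LlamaForCausalLM.from_pretrained` works directly on a checkpoint saved after merging: the file contains only standard dense weights.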

datalee commented 1 year ago

Could .merge_and_reinit() be doing something wrong? Its results differ significantly from LoRA's.