cc @younesbelkada
Just looking for clarity on whether this is normal and, if so, how to get the trained model into a sane format I can carry on with.
@younesbelkada sorry for the ping, but this is a pretty big blocker for some work I am doing. Is there a quick how-to on PEFT + DeepSpeed in case my understanding is lacking here?
Hi @cdoern
Apologies for my late reply. In the latest PEFT / transformers we support QLoRA + DeepSpeed on all ZeRO stages: https://huggingface.co/docs/peft/accelerate/deepspeed (see that documentation page for more details).
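For reference, a minimal QLoRA setup along the lines of that page might look like the sketch below; the model name and LoRA hyperparameters are placeholders, not a prescribed configuration:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

# 4-bit quantization config; setting bnb_4bit_quant_storage to the compute
# dtype keeps the quantized weights in a form that ZeRO-3 can shard.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_quant_storage=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "base-model-name",  # placeholder
    quantization_config=bnb_config,
    torch_dtype=torch.bfloat16,
)
model = get_peft_model(model, LoraConfig(r=8, lora_alpha=16, task_type="CAUSAL_LM"))
```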
You are using QLoRA, meaning you are fine-tuning adapters on top of the full model (i.e. the base model), so it is expected that you only have an adapter_model.safetensors file. Note that you can load the adapters directly with AutoModelForCausalLM.from_pretrained (see https://huggingface.co/docs/transformers/peft for reference), or you can use PeftModel.from_pretrained from peft and call model.merge_and_unload() to merge the adapters into the base model and save the merged model as a standalone model.
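As a minimal sketch of both options (the checkpoint paths and base model name below are placeholders):

```python
from transformers import AutoModelForCausalLM
from peft import PeftModel

# Option 1: load the adapter checkpoint directly; transformers detects
# adapter_config.json and loads the base model with the adapters attached.
model = AutoModelForCausalLM.from_pretrained("path/to/adapter_checkpoint")

# Option 2: merge the adapters into the base model and save a standalone
# model that no longer depends on peft.
base = AutoModelForCausalLM.from_pretrained("base-model-name")  # placeholder
peft_model = PeftModel.from_pretrained(base, "path/to/adapter_checkpoint")
merged = peft_model.merge_and_unload()
merged.save_pretrained("path/to/merged_model")
```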
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.
System Info
Python 3.11.9
Transformers 4.38.2
torch 2.3.0
Who can help?
No response
Information
Tasks
An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
Reproduction
All of the steps below occur when creating a Trainer in a Python script and running trainer.train().
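A hypothetical sketch of the kind of setup meant here (the model, dataset, and arguments are placeholders, not the actual script):

```python
from transformers import AutoModelForCausalLM, Trainer, TrainingArguments
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained("base-model-name")  # placeholder
model = get_peft_model(model, LoraConfig(r=8, lora_alpha=16, task_type="CAUSAL_LM"))

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="out"),
    train_dataset=train_dataset,  # placeholder dataset
)
trainer.train()
# out/checkpoint-*/ then contains adapter_model.safetensors (the LoRA
# weights only), not pytorch_model.bin.
```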
Expected behavior
I believe there should be a pytorch_model.bin for me to load. If LoRA is somehow changing where everything gets stored, this isn't documented anywhere. Training succeeds, but I am not merging or unloading anything, because I am not sure where to grab the proper weights from; the info in the docs doesn't cover this.