microsoft / DeepSpeed

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
https://www.deepspeed.ai/
Apache License 2.0

Deep Speed Optimizer index out of range during Training #6482

Open manitadayon opened 2 months ago

manitadayon commented 2 months ago

Hi, I am trying to fine-tune my Llama model using DeepSpeed, Accelerate, and SFTTrainer along with QLoRA. I have already pretrained my Llama model; during pretraining I used DeepSpeed and PEFT as well with no problem. However, now that I am loading my base model with the PEFT adapter and trying to fine-tune it, I get this error:

self.dtype = self.optimizer.param_groups[0]['params'][0].dtype
IndexError: list index out of range.

I loaded my adapter into my base model as follows:

 from peft import PeftModel

 adapter_path = "path"  # Update with your adapter path
 model = PeftModel.from_pretrained(model, adapter_path)

I have 2 A100 GPUs and I am using a quantized Llama model.

The DeepSpeed configuration is the same for pretraining and fine-tuning, and I am not passing any optimizer in the config; everything is left at the defaults.

I searched the internet and could not find any related information on this error. Does anyone know what this error refers to and how to fix it?

SundayVHan commented 1 month ago

Hi! In my case, I discovered that I had frozen all parameters in my model, which led to the error.
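For anyone hitting the same IndexError: the optimizer is typically built only from parameters that require gradients, so if every parameter is frozen the optimizer's param list is empty and `param_groups[0]['params'][0]` fails. Here is a minimal, framework-free sketch of that bookkeeping; the `Param` class and `build_param_groups` helper are hypothetical stand-ins for torch parameters and optimizer construction, not DeepSpeed's actual code:

```python
# Hypothetical stand-in for a torch parameter: only requires_grad matters here.
class Param:
    def __init__(self, requires_grad):
        self.requires_grad = requires_grad

def build_param_groups(params):
    # Trainers typically hand the optimizer only trainable parameters,
    # mirroring self.optimizer.param_groups[0]['params'] in the traceback.
    trainable = [p for p in params if p.requires_grad]
    return [{"params": trainable}]

# All parameters frozen (e.g. an adapter loaded for inference only):
frozen_groups = build_param_groups([Param(False), Param(False)])
print(len(frozen_groups[0]["params"]))  # 0 -> indexing [0] raises IndexError

# At least one trainable parameter avoids the error:
ok_groups = build_param_groups([Param(False), Param(True)])
print(len(ok_groups[0]["params"]))  # 1
```

So before starting training, it is worth checking that at least one parameter has `requires_grad=True`; with PEFT in particular, an adapter loaded purely for inference may be frozen by default.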