IBM / Dromedary

Dromedary: towards helpful, ethical and reliable LLMs.
GNU General Public License v3.0

adapter_name problem #15

Open Harry-mic opened 11 months ago

Harry-mic commented 11 months ago

Hi! I ran into an issue during Step 3 (SFT).

The function `get_accelerate_model` in qlora_model.py sets adapter_name="lora_default", which results in the trainable parameters being reported as 0.0 rather than 1.6% of the full parameters:

def get_accelerate_model(
    args: Namespace,
    checkpoint_dir: Optional[str] = None,
    adapter_name="lora_default",
    is_trainable=True,
    reuse_base_model=False,
):

I fixed this by setting adapter_name="default". I am fine-tuning a llama-2-7b-hf model, and I wonder whether this is a bug or an issue caused by fine-tuning a different model size (7B vs. 70B). A minimal reproduction outside the repo's wrapper is sketched below.
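For reference, here is a minimal sketch of the mismatch, independent of `get_accelerate_model`. The LoRA hyperparameters and target modules are illustrative, not the repo's exact config, and the `0.0%` comment reflects the behavior reported in this thread on peft==0.6.3.dev0:

```python
# Minimal sketch of the adapter_name issue (hyperparameters are illustrative).
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

lora_config = LoraConfig(
    r=64,
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],  # illustrative; not the repo's exact list
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)

def report_trainable(adapter_name: str) -> None:
    """Load a fresh base model, attach LoRA under `adapter_name`,
    and print the trainable-parameter summary."""
    base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")
    peft_model = get_peft_model(base, lora_config, adapter_name=adapter_name)
    peft_model.print_trainable_parameters()

report_trainable("lora_default")  # as reported here on peft==0.6.3.dev0: 0.0% trainable
report_trainable("default")       # expected: trainable% ≈ 1.6
```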

Edward-Sun commented 11 months ago

Hi, it's unlikely to be caused by the difference between 7B and 70B.

Which version of PEFT did you use? We used peft==0.4.0 in our experiments, and the behavior may differ in newer or older versions.
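A quick way to check which PEFT version is installed before debugging further (the pin in the comment mirrors the version mentioned above):

```python
# Print the installed PEFT version.
import peft
print(peft.__version__)
# To match the version used in the repo's experiments:
#   pip install peft==0.4.0
```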

Harry-mic commented 11 months ago

Thanks for your reply!

I used peft==0.6.3.dev0, and I think that's the cause.
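As a general guard against this class of silent failure, a small sanity check can be run on whatever model `get_accelerate_model` returns before training starts; this is a sketch, not part of the repo:

```python
from torch import nn

def assert_has_trainable_params(model: nn.Module) -> None:
    """Fail fast if no parameters require grad, e.g. when an
    adapter_name / PEFT-version mismatch freezes the LoRA weights."""
    n_trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
    assert n_trainable > 0, (
        "No trainable parameters; check adapter_name and the installed peft version"
    )
```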