johnsmith0031 / alpaca_lora_4bit

MIT License

Finetuning CodeLlama 34B - RuntimeError: The size of tensor a (1024) must match the size of tensor b (8192) #152

Closed juanps90 closed 11 months ago

juanps90 commented 11 months ago

I've successfully finetuned LLaMA v1 up to 34B and LLaMA v2 up to 13B, but can't seem to get CodeLlama to work with this repo.

This is the error that I'm stumbling upon:

│ /home/user/alpaca_lora_4bit/monkeypatch/peft_tuners_lora_monkey_patch        │
│ .py:77 in forward                                                            │
│                                                                              │
│    74 │   │   │   │   │   │   )                                              │
│    75 │   │   │   │   │   │   * self.scaling[self.active_adapter]            │
│    76 │   │   │   │   │   )                                                  │
│ ❱  77 │   │   │   │   result += output                                       │
│    78 │   │   │   return result                                              │
│    79 │   │                                                                  │
│    80 │   │   @property                                                      │
╰──────────────────────────────────────────────────────────────────────────────╯
RuntimeError: The size of tensor a (1024) must match the size of tensor b (8192)
at non-singleton dimension 2

These are the training parameters:

------training------
mbatch_size=1
batch_size=2
gradient_accumulation_steps=2
epochs=3
lr=0.0002
cutoff_len=2048
lora_r=16
lora_alpha=32
lora_dropout=0
val_set_size=0.0
gradient_checkpointing=True
gradient_checkpointing_ratio=1
warmup_steps=50
save_steps=50
save_total_limit=3
logging_steps=10
checkpoint=False
skip=False
world_size=1
ddp=False
device_map='auto'
groupsize=-1
v1=False
backend='cuda'

Hardware is dual RTX 3090s; finetuning works fine with LLaMA v2 13B on the same setup. The CodeLlama weights are TheBloke's GPTQ quantization (https://huggingface.co/TheBloke/CodeLlama-34B-GPTQ).
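The failing line in the monkey patch is an in-place add of the LoRA branch onto the base layer's output, so the error means those two tensors disagree on the feature dimension. A minimal sketch reproducing it in isolation (the 1024/8192 sizes are taken from the traceback, where 8192 is CodeLlama-34B's hidden size; the batch and sequence dims are illustrative):

```python
import torch

# Stand-ins for the two tensors in peft_tuners_lora_monkey_patch.py:77.
# "result" is the base layer's output, "output" is the LoRA branch.
result = torch.zeros(1, 4, 1024)   # tensor a in the error message
output = torch.zeros(1, 4, 8192)   # tensor b in the error message

try:
    result += output  # the failing `result += output` from the traceback
except RuntimeError as e:
    print(e)  # size mismatch at non-singleton dimension 2
```

This kind of mismatch typically points at the model being loaded with the wrong config (here, a transformers version that predates CodeLlama support), not at the LoRA hyperparameters.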

johnsmith0031 commented 11 months ago

Which version of transformers are you using?

juanps90 commented 11 months ago

Sorry for the delayed answer:

transformers @ git+https://github.com/huggingface/transformers.git@17a55534f5e5df10ac4804d4270bf6b8cc24998d

juanps90 commented 11 months ago

Updated to the latest transformers git commit and it's training now. Thank you!