Saved model doubles in size

lxe / llama-tune

LLaMA Tuning with Stanford Alpaca Dataset using DeepSpeed and Transformers

Open lxe opened 1 year ago

lxe commented 1 year ago

After saving, I get 2 parts of 13 GB each, while the original 7B model is 13 GB in total.
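One possible explanation, not confirmed anywhere in this thread, is that the weights are being saved in 32-bit precision while the original checkpoint is stored in 16-bit. The rough arithmetic below is only a sketch: the parameter count of about 6.7 billion is an approximate figure assumed for illustration, and the function name is made up.

```python
# Rough checkpoint-size estimate: parameter count times bytes per parameter.
# Ignores optimizer state, metadata, and any file-format overhead.
BYTES_PER_PARAM = {"float16": 2, "bfloat16": 2, "float32": 4}


def checkpoint_size_gb(num_params: int, dtype: str) -> float:
    """Approximate size of the raw weights in GB (10**9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


# Assumption: roughly 6.7e9 parameters for the 7B model (approximate figure).
NUM_PARAMS_7B = 6_700_000_000

for dtype in ("float16", "float32"):
    print(f"{dtype}: {checkpoint_size_gb(NUM_PARAMS_7B, dtype):.1f} GB")
# float16: 13.4 GB
# float32: 26.8 GB
```

Under that assumption, a float32 save would come to about twice the float16 size, which is in line with two 13 GB parts. If that turns out to be the cause, casting the model to half precision before saving (for example with `model.half()` in PyTorch) may bring the size back down.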

manred1997 commented 1 year ago

Can you share your training config (e.g., batch_size, max_seq_len, ...)? How many resources (VRAM, ...) does training need?