tloen / alpaca-lora

Instruct-tune LLaMA on consumer hardware

load_in_8bit causing issues: out-of-memory error with 44 GB VRAM on my GPU, or device_map error #604

Open · Nimisha-Pabbichetty opened this issue 9 months ago

Nimisha-Pabbichetty commented 9 months ago

I'm able to get the generate.py script working. However, with the finetune.py script I'm facing the following error: [screenshot of the error]

It seems to be because the load_in_8bit parameter is set to True and the loader looks for a quantization_config.json, but if I set it to False, then even a GPU with 44 GB of VRAM is not enough to train the model. How do I create the quantization_config.json? I'm using huggyllama/llama-7b as the base model since the given link for the base model is down. I get the same error when I use baffo32/decapoda-research-llama-7B-hf as the base model.

Any help would be appreciated, thank you!
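For reference, a minimal sketch of how the model load could be written so that the 8-bit quantization config is supplied at load time rather than read from a file in the model repo (assuming a recent transformers with bitsandbytes installed; the model name is taken from the report above, and this is not necessarily how finetune.py does it in this repo):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

base_model = "huggyllama/llama-7b"  # base model used in the report above

# Pass the 8-bit quantization config explicitly; with this, no
# quantization_config.json needs to exist in the model repo.
bnb_config = BitsAndBytesConfig(load_in_8bit=True)

model = AutoModelForCausalLM.from_pretrained(
    base_model,
    quantization_config=bnb_config,
    torch_dtype=torch.float16,
    device_map="auto",  # spread layers across available GPUs/CPU
)
tokenizer = AutoTokenizer.from_pretrained(base_model)
```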

Minimindy commented 8 months ago

I think it's running out of memory. Maybe you should try Colab, or free up GPU memory to make space to load the model.
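For example, a quick way to check and free GPU memory from Python before loading the model (standard PyTorch calls, not specific to this repo):

```python
import gc
import torch

# Drop unreferenced tensors and return cached blocks to the driver.
gc.collect()
torch.cuda.empty_cache()

# Report how much memory is currently allocated on the GPU.
print(f"{torch.cuda.memory_allocated() / 1e9:.2f} GB allocated")
```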