bitsandbytes-foundation / bitsandbytes

Accessible large language models via k-bit quantization for PyTorch.
https://huggingface.co/docs/bitsandbytes/main/en/index
MIT License

CUDA Setup failed despite GPU being available #1289

Open Keertiraj opened 4 months ago

Keertiraj commented 4 months ago

System Info

I am fine-tuning the Llama3-8b-Instruct model. Here is the Jupyter notebook with the steps I followed to perform the fine-tuning:

https://gitlab.com/keerti4p/llama3-8b-instruct-finetune/-/blob/main/llama3_finetune_8b_instruct.ipynb

I am using an 'ml.g5.24xlarge' AWS EC2 instance to fine-tune the Llama3 model. However, when I execute the code snippet below:

    # Set supervised fine-tuning parameters
    trainer = SFTTrainer(
        model=model,
        train_dataset=dataset,
        peft_config=peft_config,        # use our LoRA PEFT config
        dataset_text_field="text",
        max_seq_length=None,            # no max sequence length
        tokenizer=tokenizer,            # use the Llama tokenizer
        args=training_arguments,        # use the training arguments
        packing=False,                  # don't need packing
    )

I am receiving the below error:

RuntimeError: CUDA Setup failed despite GPU being available. Please run the following command to get more information:

    python -m bitsandbytes

    Inspect the output of the command and see if you can locate CUDA libraries. You might need to add them
    to your LD_LIBRARY_PATH. If you suspect a bug, please take the information from python -m bitsandbytes
    and open an issue at: https://github.com/TimDettmers/bitsandbytes/issues

Please suggest how to resolve this error.

Reproduction

Instantiate an 'ml.g5.24xlarge' AWS EC2 instance through an AWS SageMaker JupyterLab space and run the code in the Jupyter notebook below:

https://gitlab.com/keerti4p/llama3-8b-instruct-finetune/-/blob/main/llama3_finetune_8b_instruct.ipynb

Expected behavior

The 'Llama3-8b-Instruct' model is fine-tuned and pushed to Hugging Face.

Titus-von-Koeller commented 4 months ago

Please update bitsandbytes to the newest version, then run the command you already mentioned above and share the debug output. Otherwise it's really hard to help you.

Seems like CUDA isn't properly installed or not detected correctly.

!nvidia-smi              # copy and post the output
!pip install --upgrade "bitsandbytes>=0.43.2"
!python -m bitsandbytes  # copy and post the output
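
If the `python -m bitsandbytes` output shows the CUDA libraries missing, extending LD_LIBRARY_PATH as the error message suggests often resolves it. A hedged shell sketch (the /usr/local/cuda/lib64 path is an assumption; check where CUDA actually lives on your instance, e.g. with `ldconfig -p | grep libcudart`):

```shell
# Point the dynamic loader at the CUDA toolkit's libraries before retrying.
# NOTE: /usr/local/cuda/lib64 is an assumed location; substitute the
# directory that actually contains libcudart on your instance.
export LD_LIBRARY_PATH="/usr/local/cuda/lib64:${LD_LIBRARY_PATH}"
# Then re-run the self-check to confirm the libraries are found:
# python -m bitsandbytes
```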