Hi, after fine-tuning LLaMA2 with the script python lora_finetune_wn11.py, I ran python lora_infer_wn11.py, and the following issue occurred:
Could not find the bitsandbytes CUDA binary at PosixPath('/data/ChenWei/miniconda3/envs/graphgpt/lib/python3.8/site-packages/bitsandbytes/libbitsandbytes_cuda113.so')
The installed version of bitsandbytes was compiled without GPU support. 8-bit optimizers, 8-bit multiplication, and GPU quantization are unavailable.
Loading checkpoint shards: 100%|███████████████████████████████████████████████████████████████████████| 2/2 [00:22<00:00, 11.44s/it]
Traceback (most recent call last):
  File "lora_infer_wn11.py", line 37, in <module>
    model = PeftModel.from_pretrained(
  File "/data/ChenWei/miniconda3/envs/graphgpt/lib/python3.8/site-packages/peft/peft_model.py", line 271, in from_pretrained
    model.load_adapter(model_id, adapter_name, is_trainable=is_trainable, **kwargs)
  File "/data/ChenWei/miniconda3/envs/graphgpt/lib/python3.8/site-packages/peft/peft_model.py", line 554, in load_adapter
    adapters_weights = safe_load_file(filename, device="cuda" if torch.cuda.is_available() else "cpu")
  File "/data/ChenWei/miniconda3/envs/graphgpt/lib/python3.8/site-packages/safetensors/torch.py", line 308, in load_file
    with safe_open(filename, framework="pt", device=device) as f:
safetensors_rust.SafetensorError: Error while deserializing header: InvalidHeaderDeserialization
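
For reference, the failing call in lora_infer_wn11.py presumably boils down to something like the sketch below (the base-model id and adapter path are placeholders I made up, not taken from the scripts). Since the traceback dies while safetensors is parsing the file header, InvalidHeaderDeserialization typically points to a malformed adapter_model.safetensors, e.g. an empty or truncated file from an interrupted save, so checking its size on disk is a quick first test:

```python
import torch
from pathlib import Path
from transformers import AutoModelForCausalLM
from peft import PeftModel

# Hypothetical path: wherever lora_finetune_wn11.py saved the adapter.
adapter_dir = "./lora_wn11_adapter"

# InvalidHeaderDeserialization usually means the safetensors file itself is
# bad (empty or truncated), so inspect the file size before loading anything.
weights = Path(adapter_dir) / "adapter_model.safetensors"
if weights.exists():
    print(f"{weights}: {weights.stat().st_size} bytes")  # 0 or a few bytes => corrupted save

# Hypothetical base-model id; substitute whatever lora_finetune_wn11.py used.
base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",
    torch_dtype=torch.float16,
)
model = PeftModel.from_pretrained(base, adapter_dir)  # raises SafetensorError here
```

If the file size looks wrong, re-running the fine-tuning save step (and making sure it completes) seems like a cleaner fix than patching the loader.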