artidoro / qlora

QLoRA: Efficient Finetuning of Quantized LLMs
https://arxiv.org/abs/2305.14314
MIT License

[XPU] CUDA error when running on arc770 with Intel extension for pytorch #266

Open delock opened 1 year ago

delock commented 1 year ago

@abhilash1910 is XPU support for qlora still working? I tried to run it on a linux arc770 system at home but got the following error:

```
$ python qlora.py --model_name_or_path facebook/opt-350m
Num processes: 1
Process index: 0
Local process index: 0
Device: xpu:0
, _n_gpu=0, cachedsetup_devices=device(type='cpu'), deepspeed_plugin=None)
loading base model facebook/opt-350m...
/home/akey/anaconda3/envs/lora/lib/python3.11/site-packages/transformers/modeling_utils.py:2193: FutureWarning: The use_auth_token argument is deprecated and will be
  warnings.warn(
Traceback (most recent call last):
  File "/home/akey/machine_learning/qlora/qlora.py", line 841, in <module>
    train()
  File "/home/akey/machine_learning/qlora/qlora.py", line 704, in train
    model, tokenizer = get_accelerate_model(args, checkpoint_dir)
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/akey/machine_learning/qlora/qlora.py", line 311, in get_accelerate_model
    model = AutoModelForCausalLM.from_pretrained(
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/akey/anaconda3/envs/lora/lib/python3.11/site-packages/transformers/models/auto/auto_factory.py", line 493, in from_pretrained
    return model_class.from_pretrained(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/akey/anaconda3/envs/lora/lib/python3.11/site-packages/transformers/modeling_utils.py", line 2903, in from_pretrained
    ) = cls._load_pretrained_model(
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/akey/anaconda3/envs/lora/lib/python3.11/site-packages/transformers/modeling_utils.py", line 3260, in _load_pretrained_model
    new_error_msgs, offload_index, state_dict_index = _load_state_dict_into_meta_model(
                                                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/akey/anaconda3/envs/lora/lib/python3.11/site-packages/transformers/modeling_utils.py", line 725, in _load_state_dict_into_meta_model
    set_module_quantized_tensor_to_device(
  File "/home/akey/anaconda3/envs/lora/lib/python3.11/site-packages/transformers/utils/bitsandbytes.py", line 109, in set_module_quantized_tensor_to_device
    new_value = value.to(device)
                ^^^^^^^^^^^^^^^^
  File "/home/akey/anaconda3/envs/lora/lib/python3.11/site-packages/torch/cuda/__init__.py", line 239, in _lazy_init
    raise AssertionError("Torch not compiled with CUDA enabled")
AssertionError: Torch not compiled with CUDA enabled
```
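For context, the assertion comes from torch's lazy CUDA initializer: the load path resolves the target device to a CUDA device and calls `value.to(device)`, which trips `_lazy_init` on a build compiled without CUDA. A minimal sketch of the fallback ordering such a setup would need (the helper name `pick_device` is hypothetical, not part of qlora or transformers; on a real XPU system the flags would come from `torch.xpu.is_available()` and `torch.cuda.is_available()`):

```python
def pick_device(xpu_available: bool, cuda_available: bool) -> str:
    """Choose a device string for tensor.to(...): prefer XPU, then CUDA,
    then fall back to CPU instead of assuming CUDA exists."""
    if xpu_available:
        return "xpu"
    if cuda_available:
        return "cuda"
    return "cpu"

# On the system reported above: XPU present, torch built without CUDA.
print(pick_device(xpu_available=True, cuda_available=False))  # → xpu
```

The failure happens because the quantized-weight placement path hardcodes the CUDA branch rather than applying a fallback like this.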

My pip list; is there any special package needed for the xpu device?

```
(lora) 20:15:50|~/machine_learning/qlora$ pip list|grep torch
intel-extension-for-pytorch  2.0.110+xpu
torch                        2.0.1a0+cxx11.abi
torchvision                  0.15.2a0+cxx11.abi
bitsandbytes                 0.40.0
transformers                 4.31.0
accelerate                   0.21.0
```

abhilash1910 commented 1 year ago

Hi @delock, bitsandbytes quantization support is still in progress; until it lands, the quantized compute will not run on the xpu device. I will update once it is completed.