Open sanbuphy opened 6 months ago
Hi, I have installed ipex-llm following the docs: https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/CPU/QLoRA-FineTuning
and I am hitting this error:
found intel-openmp in /root/miniconda3/envs/llm/lib/libiomp5.so
The installed version of bitsandbytes was compiled without GPU support. 8-bit optimizers, 8-bit multiplication, and GPU quantization are unavailable.
found tcmalloc in /root/miniconda3/envs/llm/lib/python3.11/site-packages/ipex_llm/libs/libtcmalloc.so
+++++ Env Variables +++++
Internal:
    ENABLE_IOMP = 1
    ENABLE_GPU = 0
    ENABLE_JEMALLOC = 0
    ENABLE_TCMALLOC = 1
    LIB_DIR = /root/miniconda3/envs/llm/lib
    BIN_DIR = /root/miniconda3/envs/llm/bin
    LLM_DIR = /root/miniconda3/envs/llm/lib/python3.11/site-packages/ipex_llm
Exported:
    LD_PRELOAD = /root/miniconda3/envs/llm/lib/libiomp5.so /root/miniconda3/envs/llm/lib/python3.11/site-packages/ipex_llm/libs/libtcmalloc.so
    OMP_NUM_THREADS = 12
    MALLOC_CONF =
    USE_XETLA =
    ENABLE_SDP_FUSION =
    SYCL_CACHE_PERSISTENT =
    BIGDL_LLM_XMX_DISABLED =
    SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS =
+++++++++++++++++++++++++
Complete.
2024-06-04 20:20:46,549 - WARNING - The installed version of bitsandbytes was compiled without GPU support. 8-bit optimizers, 8-bit multiplication, and GPU quantization are unavailable.
2024-06-04 20:20:46,975 - INFO - PyTorch version 2.1.2+cpu available.
/root/miniconda3/envs/llm/lib/python3.11/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
  warnings.warn(
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Map: 100%| 5176/5176 [00:02<00:00, 2348.96 examples/s]
/root/miniconda3/envs/llm/lib/python3.11/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
  warnings.warn(
Loading checkpoint shards: 100%| 4/4 [00:00<00:00, 10.25it/s]
2024-06-04 20:20:55,720 - INFO - Converting the current model to sym_int4 format......
/root/miniconda3/envs/llm/lib/python3.11/site-packages/transformers/optimization.py:429: FutureWarning: This implementation of AdamW is deprecated and will be removed in a future version. Use the PyTorch implementation torch.optim.AdamW instead, or set `no_deprecation_warning=True` to disable this warning
  warnings.warn(
  0%| | 0/200 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/hy-tmp/ipex-llm/python/llm/example/CPU/QLoRA-FineTuning/qlora_finetuning_cpu.py", line 120, in <module>
    result = trainer.train()
  File "/root/miniconda3/envs/llm/lib/python3.11/site-packages/transformers/trainer.py", line 1624, in train
    return inner_training_loop(
  File "/root/miniconda3/envs/llm/lib/python3.11/site-packages/transformers/trainer.py", line 1961, in _inner_training_loop
    tr_loss_step = self.training_step(model, inputs)
  File "/root/miniconda3/envs/llm/lib/python3.11/site-packages/transformers/trainer.py", line 2902, in training_step
    loss = self.compute_loss(model, inputs)
  File "/root/miniconda3/envs/llm/lib/python3.11/site-packages/transformers/trainer.py", line 2925, in compute_loss
    outputs = model(**inputs)
  File "/root/miniconda3/envs/llm/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/root/miniconda3/envs/llm/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/root/miniconda3/envs/llm/lib/python3.11/site-packages/accelerate/utils/operations.py", line 817, in forward
    return model_forward(*args, **kwargs)
  File "/root/miniconda3/envs/llm/lib/python3.11/site-packages/accelerate/utils/operations.py", line 805, in __call__
    return convert_to_fp32(self.model_forward(*args, **kwargs))
  File "/root/miniconda3/envs/llm/lib/python3.11/site-packages/torch/amp/autocast_mode.py", line 16, in decorate_autocast
    return func(*args, **kwargs)
  File "/root/miniconda3/envs/llm/lib/python3.11/site-packages/peft/peft_model.py", line 1129, in forward
    return self.base_model(
  File "/root/miniconda3/envs/llm/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/root/miniconda3/envs/llm/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/root/miniconda3/envs/llm/lib/python3.11/site-packages/peft/tuners/tuners_utils.py", line 161, in forward
    return self.model.forward(*args, **kwargs)
  File "/root/miniconda3/envs/llm/lib/python3.11/site-packages/transformers/models/qwen2/modeling_qwen2.py", line 1173, in forward
    outputs = self.model(
  File "/root/miniconda3/envs/llm/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/root/miniconda3/envs/llm/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/root/miniconda3/envs/llm/lib/python3.11/site-packages/transformers/models/qwen2/modeling_qwen2.py", line 1058, in forward
    layer_outputs = decoder_layer(
  File "/root/miniconda3/envs/llm/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/root/miniconda3/envs/llm/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/root/miniconda3/envs/llm/lib/python3.11/site-packages/transformers/models/qwen2/modeling_qwen2.py", line 773, in forward
    hidden_states, self_attn_weights, present_key_value = self.self_attn(
  File "/root/miniconda3/envs/llm/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/root/miniconda3/envs/llm/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/root/miniconda3/envs/llm/lib/python3.11/site-packages/transformers/models/qwen2/modeling_qwen2.py", line 698, in forward
    attn_output = torch.nn.functional.scaled_dot_product_attention(
RuntimeError: Expected attn_mask dtype to be bool or to match query dtype, but got attn_mask.dtype: float and query.dtype: c10::BFloat16 instead.
  0%| | 0/200 [00:00<?, ?it/s]
But when I update PyTorch to the latest version (2.3.0), it works. I don't know what happened, could you take a look? Thanks!
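For context, the failure comes from `torch.nn.functional.scaled_dot_product_attention`, which requires the attention mask to be bool or to match the query dtype. Below is a minimal sketch, independent of ipex-llm and transformers, that illustrates the constraint and a cast workaround; on torch 2.1.x the first call raises the same RuntimeError, while newer releases may handle the mismatch differently.

```python
import torch
import torch.nn.functional as F

# Shapes are arbitrary; only the dtypes matter for this check.
q = torch.randn(1, 8, 16, 64, dtype=torch.bfloat16)
k = torch.randn(1, 8, 16, 64, dtype=torch.bfloat16)
v = torch.randn(1, 8, 16, 64, dtype=torch.bfloat16)

# A float32 additive mask, similar in spirit to the 4D causal mask
# that the model's attention layer receives during training.
mask = torch.zeros(1, 1, 16, 16, dtype=torch.float32)

try:
    F.scaled_dot_product_attention(q, k, v, attn_mask=mask)
except RuntimeError as e:
    # On torch 2.1.x this prints:
    # "Expected attn_mask dtype to be bool or to match query dtype, ..."
    print(e)

# Possible workaround: cast the mask to the query dtype (or use a bool mask).
out = F.scaled_dot_product_attention(q, k, v, attn_mask=mask.to(q.dtype))
print(out.dtype)  # torch.bfloat16
```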
This issue is not related to PyTorch, but it may be related to the transformers version. Can you provide the installed transformers version?
We will also try to reproduce this issue in our local env.
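For example, a quick snippet like this prints the versions of the packages involved (just a convenience; `pip list` works equally well):

```python
# Report the versions relevant to this QLoRA fine-tuning example.
import accelerate, peft, torch, transformers

for mod in (torch, transformers, peft, accelerate):
    print(f"{mod.__name__}: {mod.__version__}")
```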
Hi, my transformers version is 4.36.0, but I just updated the PyTorch version and it works.
Hi @sanbuphy, glad to know it works for you. Can you share the `pip list` of your env, in case other customers encounter similar problems?
OK, here is the `pip list`, but note it was taken after reinstalling PyTorch; I'm not sure whether that modified other packages in the env, so that may be worth checking.
Package Version
------------------------ --------------
accelerate 0.27.2
aiohttp 3.9.5
aiosignal 1.3.1
attrs 23.2.0
bitsandbytes 0.43.1
certifi 2024.2.2
charset-normalizer 3.3.2
datasets 2.19.1
dill 0.3.8
filelock 3.14.0
frozenlist 1.4.1
fsspec 2024.3.1
huggingface-hub 0.23.2
idna 3.7
intel-openmp 2024.1.2
ipex-llm 2.1.0b20240601
Jinja2 3.1.4
MarkupSafe 2.1.5
mpmath 1.3.0
multidict 6.0.5
multiprocess 0.70.16
networkx 3.3
numpy 1.26.4
nvidia-cublas-cu12 12.1.3.1
nvidia-cuda-cupti-cu12 12.1.105
nvidia-cuda-nvrtc-cu12 12.1.105
nvidia-cuda-runtime-cu12 12.1.105
nvidia-cudnn-cu12 8.9.2.26
nvidia-cufft-cu12 11.0.2.54
nvidia-curand-cu12 10.3.2.106
nvidia-cusolver-cu12 11.4.5.107
nvidia-cusparse-cu12 12.1.0.106
nvidia-nccl-cu12 2.20.5
nvidia-nvjitlink-cu12 12.5.40
nvidia-nvtx-cu12 12.1.105
packaging 24.0
pandas 2.2.2
peft 0.10.0
pillow 10.2.0
pip 24.0
protobuf 5.27.0
psutil 5.9.8
py-cpuinfo 9.0.0
pyarrow 16.1.0
pyarrow-hotfix 0.6
python-dateutil 2.9.0.post0
pytz 2024.1
PyYAML 6.0.1
regex 2024.5.15
requests 2.32.3
safetensors 0.4.3
scipy 1.13.1
sentencepiece 0.2.0
setuptools 69.5.1
six 1.16.0
sympy 1.12.1
tabulate 0.9.0
tokenizers 0.15.2
torch 2.3.0+cpu
torchaudio 2.3.0+cpu
torchvision 0.18.0+cpu
tqdm 4.66.4
transformers 4.38.0
triton 2.3.0
typing_extensions 4.12.1
tzdata 2024.1
urllib3 2.2.1
wheel 0.43.0
xxhash 3.4.1
yarl 1.9.4