Open zcwang opened 2 months ago
The Arc770 and the iGPU can't work in the same environment; we are still working on it. Related issue: https://github.com/intel-analytics/ipex-llm/issues/10940
But the error there is different: it should be `RuntimeError: could not create a primitive`. The difference may be caused by your different torch version.
Got it! I will remove the Arc770 and test my iGPU again on MTL.
BTW, I also tested the same SW environment on my TGL platform (Core i7-1185G7), and the iGPU indeed works well.
intel_extension_for_pytorch 2.1.20+git0e2bee2
torch 2.1.0.post0+cxx11.abi
torchvision 0.16.0+fbb4cc5
intel-openmp 2024.1.0
openvino 2024.1.0
openvino-telemetry 2024.1.0
[opencl:acc:0] Intel(R) FPGA Emulation Platform for OpenCL(TM), Intel(R) FPGA Emulation Device OpenCL 1.2 [2024.17.3.0.08_160000]
[opencl:cpu:1] Intel(R) OpenCL, 11th Gen Intel(R) Core(TM) i7-1185G7 @ 3.00GHz OpenCL 3.0 (Build 0) [2024.17.3.0.08_160000]
[opencl:gpu:2] Intel(R) OpenCL Graphics, Intel(R) Iris(R) Xe Graphics OpenCL 3.0 NEO [24.13.29138.7]
[ext_oneapi_level_zero:gpu:0] Intel(R) Level-Zero, Intel(R) Iris(R) Xe Graphics 1.3 [1.3.29138]
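As an aside, the `ONEAPI_DEVICE_SELECTOR=level_zero:0` used in the command below picks device 0 of the Level-Zero backend from a listing like the one above. A minimal sketch of how such a listing maps indices to devices (the `devices_for_backend` helper is my own illustration, not an ipex-llm or SYCL API):

```python
# Illustrative parser for sycl-ls-style output. The sample lines are taken
# from the listing in this thread (version brackets trimmed); the parsing
# logic itself is a sketch, not part of ipex-llm.
import re

SYCL_LS_OUTPUT = """\
[opencl:acc:0] Intel(R) FPGA Emulation Platform for OpenCL(TM), Intel(R) FPGA Emulation Device OpenCL 1.2
[opencl:cpu:1] Intel(R) OpenCL, 11th Gen Intel(R) Core(TM) i7-1185G7 @ 3.00GHz OpenCL 3.0
[opencl:gpu:2] Intel(R) OpenCL Graphics, Intel(R) Iris(R) Xe Graphics OpenCL 3.0 NEO
[ext_oneapi_level_zero:gpu:0] Intel(R) Level-Zero, Intel(R) Iris(R) Xe Graphics 1.3
"""

def devices_for_backend(output: str, backend: str) -> dict[int, str]:
    """Map device index -> description for one SYCL backend."""
    result = {}
    for line in output.splitlines():
        m = re.match(r"\[([\w:]+):(\w+):(\d+)\]\s*(.*)", line)
        if m and m.group(1).endswith(backend):
            result[int(m.group(3))] = m.group(4)
    return result

print(devices_for_backend(SYCL_LS_OUTPUT, "level_zero"))
# → {0: 'Intel(R) Level-Zero, Intel(R) Iris(R) Xe Graphics 1.3'}
```

So `level_zero:0` here resolves to the Iris Xe iGPU, which is why the generate run below lands on it.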
(llm-test) intel@myDUT:~/work/ipex-llm/python/llm/example/GPU/HF-Transformers-AutoModels/Model/llama3$ ONEAPI_DEVICE_SELECTOR=level_zero:0 python ./generate.py --repo-id-or-model-path meta-llama/Meta-Llama-3-8B-Instruct --prompt 'History of Intel' --n-predict 64
2024-05-15 09:57:23,463 - INFO - intel_extension_for_pytorch auto imported
/home/intel/anaconda3/envs/llm-test/lib/python3.11/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
warnings.warn(
Loading checkpoint shards: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:00<00:00, 5.16it/s]
2024-05-15 09:57:25,302 - INFO - Converting the current model to sym_int4 format......
/home/intel/anaconda3/envs/llm-test/lib/python3.11/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
warnings.warn(
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:128009 for open-end generation.
The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:128009 for open-end generation.
Inference time: 9.984711408615112 s
-------------------- Prompt --------------------
<|begin_of_text|><|start_header_id|>user<|end_header_id|>
History of Intel<|eot_id|><|start_header_id|>assistant<|end_header_id|>
-------------------- Output (skip_special_tokens=False) --------------------
<|begin_of_text|><|begin_of_text|><|start_header_id|>user<|end_header_id|>
History of Intel<|eot_id|><|start_header_id|>assistant<|end_header_id|>
Intel Corporation is an American multinational corporation that specializes in the design and manufacture of microprocessors, memory chips, and other semiconductor technologies. Here is a brief history of the company:
Early Years (1968-1979)
Intel was founded on July 18, 1968, by Gordon Moore and Robert N
@qiuxin2012, I appreciate your support.
@qiuxin2012, I confirmed that the MTL-H iGPU works well without the Arc770 in the platform.
[opencl:acc:0] Intel(R) FPGA Emulation Platform for OpenCL(TM), Intel(R) FPGA Emulation Device OpenCL 1.2 [2024.17.3.0.08_160000]
[opencl:cpu:1] Intel(R) OpenCL, Intel(R) Core(TM) Ultra 7 155H OpenCL 3.0 (Build 0) [2024.17.3.0.08_160000]
[opencl:gpu:2] Intel(R) OpenCL Graphics, Intel(R) Arc(TM) Graphics OpenCL 3.0 NEO [24.13.29138.7]
[ext_oneapi_level_zero:gpu:0] Intel(R) Level-Zero, Intel(R) Arc(TM) Graphics 1.3 [1.3.29138]
...
(llm) intel@mydevice:~/work/ipex-llm/python/llm/example/GPU/HF-Transformers-AutoModels/Model/llama3$ ONEAPI_DEVICE_SELECTOR=level_zero:0 python ./generate.py --repo-id-or-model-path meta-llama/Meta-Llama-3-8B-Instruct --prompt 'History of Intel' --n-predict 64
2024-05-15 10:36:33,547 - INFO - intel_extension_for_pytorch auto imported
Loading checkpoint shards: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:00<00:00, 5.48it/s]
2024-05-15 10:36:34,559 - INFO - Converting the current model to sym_int4 format......
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:128009 for open-end generation.
The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:128009 for open-end generation.
Inference time: 6.857227563858032 s
-------------------- Prompt --------------------
<|begin_of_text|><|start_header_id|>user<|end_header_id|>
History of Intel<|eot_id|><|start_header_id|>assistant<|end_header_id|>
-------------------- Output (skip_special_tokens=False) --------------------
<|begin_of_text|><|begin_of_text|><|start_header_id|>user<|end_header_id|>
History of Intel<|eot_id|><|start_header_id|>assistant<|end_header_id|>
The legendary Intel!
Intel Corporation is an American multinational corporation that specializes in the design and manufacture of microprocessors, the "brain" of modern computers. Here's a brief history of the company:
**Early Years (1968-1971)**
Intel was founded on July 18, 1968, by Gordon
LIBXSMM_VERSION: main_stable-1.17-3651 (25693763)
LIBXSMM_TARGET: adl [Intel(R) Core(TM) Ultra 7 155H]
Registry and code: 13 MB
Command: python ./generate.py --repo-id-or-model-path meta-llama/Meta-Llama-3-8B-Instruct --prompt History of Intel --n-predict 64
Uptime: 63.459550 s
Hello ipex-llm experts, I'm hitting an issue with Llama-3-8B on the MTL-H iGPU and would appreciate any advice from you. :)
There seems to be an issue with the iGPU on the MTL 155H, but no issue with the Arc770, on Ubuntu 22.04 + kernel v6.8.2.
inf, nan or element < 0"
LIBXSMM_VERSION: main_stable-1.17-3651 (25693763)
LIBXSMM_TARGET: adl [Intel(R) Core(TM) Ultra 7 155H]
Registry and code: 13 MB
Command: python ./generate.py --repo-id-or-model-path meta-llama/Meta-Llama-3-8B-Instruct --prompt History of Intel --n-predict 64
Uptime: 11.134912 s
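The truncated message above appears to be the tail of the common sampling failure "probability tensor contains either `inf`, `nan` or element < 0", which fires when the model's output probabilities become invalid (e.g. from a bad kernel on the iGPU). A plain-Python sketch of the kind of validity check behind that message (this is my own illustration, not the actual torch/ipex kernel code):

```python
# Sketch of the checks a probability vector must pass before sampling.
# Mirrors the three failure modes named in the error text.
import math

def validate_probs(probs: list[float]) -> list[str]:
    """Return the problems that would make a probability vector unsampleable."""
    issues = []
    if any(math.isinf(p) for p in probs):
        issues.append("inf")
    if any(math.isnan(p) for p in probs):
        issues.append("nan")
    if any(p < 0 for p in probs):
        issues.append("element < 0")
    return issues

print(validate_probs([0.5, float("nan"), -0.1]))
# → ['nan', 'element < 0']
```

If any of these conditions holds for the next-token distribution, sampling aborts with the error seen in the log, so the root cause is usually upstream in the forward pass rather than in generation itself.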
[opencl:acc:0] Intel(R) FPGA Emulation Platform for OpenCL(TM), Intel(R) FPGA Emulation Device OpenCL 1.2 [2024.17.3.0.08_160000]
[opencl:cpu:1] Intel(R) OpenCL, Intel(R) Core(TM) Ultra 7 155H OpenCL 3.0 (Build 0) [2024.17.3.0.08_160000]
[opencl:gpu:2] Intel(R) OpenCL Graphics, Intel(R) Arc(TM) Graphics OpenCL 3.0 NEO [24.13.29138.7]
[opencl:gpu:3] Intel(R) OpenCL Graphics, Intel(R) Arc(TM) A770 Graphics OpenCL 3.0 NEO [24.13.29138.7]
[ext_oneapi_level_zero:gpu:0] Intel(R) Level-Zero, Intel(R) Arc(TM) A770 Graphics 1.3 [1.3.29138]
[ext_oneapi_level_zero:gpu:1] Intel(R) Level-Zero, Intel(R) Arc(TM) Graphics 1.3 [1.3.29138]
intel_extension_for_pytorch 2.1.20+git0e2bee2
torch 2.1.0.post0+cxx11.abi
torchvision 0.16.0+fbb4cc5
sentence-transformers 2.3.1
transformers 4.37.0
transformers-stream-generator 0.0.5