Closed (0x33taji closed this issue 6 months ago)
The issue is with the Intel DKMS driver for Arc GPUs, not the model server.
intel-i915-dkms on the 5.15+ kernel, after the latest update to 5.15.0-92-generic, was reporting a maximum addressable memory of 248MB instead of the usual 4GB.
In other words, intel-i915-dkms is broken for Arc GPUs as of this writing.
Steps to fix the issue:
1. apt purge the DKMS driver.
2. apt install the Ubuntu HWE kernel.

Note: the dGPU docs say to use the HWE kernel, which is currently 6+, and the DKMS driver does not compile on it. The documentation needs to be fixed.
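After rebooting, it may help to confirm which kernel series is actually running, since the note above says the DKMS driver does not compile on 6+. The following is a minimal sketch, not an official check: the helper names are hypothetical, and it only parses a release string such as the `5.15.0-92-generic` mentioned in this issue.

```python
import platform
import re


def kernel_major_minor(release: str) -> tuple[int, int]:
    """Extract (major, minor) from a release string like '5.15.0-92-generic'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def is_6_series_or_newer(release: str) -> bool:
    """True when the kernel major version is 6 or higher."""
    return kernel_major_minor(release)[0] >= 6


def running_kernel_is_6_plus() -> bool:
    """Check the currently running kernel (platform.release() gives its release string)."""
    return is_6_series_or_newer(platform.release())


# The kernel named in this issue belongs to the 5.15 series, not 6+.
print(kernel_major_minor("5.15.0-92-generic"))
print(is_6_series_or_newer("5.15.0-92-generic"))
```

On the affected machine, `running_kernel_is_6_plus()` would be expected to return `True` once the HWE kernel is booted, assuming the HWE kernel is 6+ as stated above.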
The in-tree kernel driver correctly reported the maximum addressable memory as 4GB again, and the issue was resolved on the model-server side. However, the in-tree driver broke xpu-smi (almost every stat in xpu-smi stats -d 0 says N/A).
So the issue is resolved from the model-server end.
Please forward the issue to the driver team, if possible.
Describe the bug
OpenVINO IR models converted with the optimum-intel CLI do not load on the model server.
To Reproduce
Expected behavior
My GPU is an Arc A770 with 16GB of VRAM. The model should run flawlessly: microsoft/phi-2 with weights compressed to fp16 should not exceed the GPU memory.
Maybe I am doing something wrong.
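The expectation above can be checked with rough arithmetic. This is a back-of-the-envelope sketch, assuming phi-2 has about 2.7 billion parameters (an approximate figure not given in this issue) and counting only the weights, not activations or runtime overhead. It also does not model how the runtime splits the weights across individual allocations.

```python
# Rough estimate of the weight memory for phi-2 at fp16.
PARAMS = 2.7e9         # assumption: approximate phi-2 parameter count
BYTES_PER_PARAM = 2    # fp16 stores each weight in 2 bytes


def weights_gib(params: float, bytes_per_param: int) -> float:
    """Size of the weights in GiB."""
    return params * bytes_per_param / 2**30


weights = weights_gib(PARAMS, BYTES_PER_PARAM)
VRAM_GIB = 16.0                 # Arc A770 VRAM, from this issue
BROKEN_LIMIT_GIB = 248 / 1024   # 248MB reported by intel-i915-dkms

print(f"fp16 weights: ~{weights:.1f} GiB")
print(f"fits in 16 GiB of VRAM: {weights < VRAM_GIB}")
print(f"fits under the 248MB limit: {weights < BROKEN_LIMIT_GIB}")
```

Under these assumptions the weights fit comfortably in 16GB of VRAM but are far larger than the 248MB the broken driver reported, which is consistent with the model failing to load.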