Closed evanshortiss closed 2 weeks ago
Hi @evanshortiss, thanks for the report. Once the inference server is started, could you open the container logs? You can access the corresponding container details page by clicking on the status icon in AI Lab.

Ideally, could you also try running `nvidia-smi` from inside the container and provide the output?

Finally, could you provide the content of the Inspect tab of the corresponding container.

Thanks!
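The same checks can be run from a terminal; a minimal sketch, assuming `podman` is on the PATH and the playground container was started from the image named later in this thread:

```shell
# Find the running playground container (image name taken from this thread)
CID=$(podman ps --format "{{.ID}}" \
  --filter "ancestor=ghcr.io/containers/podman-desktop-extension-ai-lab-playground-images/ai-lab-playground-chat-cuda")

# Check whether the GPU is visible from inside the container
podman exec "$CID" nvidia-smi

# Dump the container's low-level configuration (same data as the Inspect tab)
podman inspect "$CID"
```

If `nvidia-smi` fails or lists no devices here, the container cannot reach the GPU regardless of what the UI reports.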
Okay, I tried the latest image on Windows and I am not able to use the GPU either. The image used was introduced by https://github.com/containers/podman-desktop-extension-ai-lab/pull/1558. It is ai-lab-playground-chat-cuda.
The previous image was llamacpp_python_cuda, which was really old and last published 3 months ago; however, this is because that image is deprecated and has been replaced by llamacpp-python-cuda (with `-` instead of `_`), which was updated 17 days ago.
Here are the compiled test results:
| Image | GPU enabled |
|---|---|
| (current) ghcr.io/containers/podman-desktop-extension-ai-lab-playground-images/ai-lab-playground-chat-cuda@sha256:023e07b729aef9d91e75f8bd57f92b3291670bc362e3aed79bff0cd050074eef | 🔴 |
| ghcr.io/containers/llamacpp-python-cuda@sha256:c81947504e7e5dcaa106844dd6672c9106b18c619fc4bb727211eb7fb1fe36d7 | 🟢 |
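For anyone wanting to reproduce the comparison outside AI Lab, something like the following should work. This is a sketch assuming a Windows host with the NVIDIA drivers installed and a CDI-enabled Podman machine (`--device nvidia.com/gpu=all` is standard CDI device syntax, not specific to this issue):

```shell
# The image that showed 🟢 GPU support in the table above
IMG=ghcr.io/containers/llamacpp-python-cuda@sha256:c81947504e7e5dcaa106844dd6672c9106b18c619fc4bb727211eb7fb1fe36d7

# Run nvidia-smi inside the container with GPU access requested via CDI;
# a GPU listing in the output means the container can reach the device
podman run --rm --device nvidia.com/gpu=all "$IMG" nvidia-smi
```

Swapping `$IMG` for the current ai-lab-playground-chat-cuda digest lets you compare both images under identical conditions.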
Thanks for looking into this @axel7083 and apologies for the delay on my end. I was on vacation!
Np, thanks for reporting!
Bug description
When running a Service/Playground on Windows 11, the UI reports GPU inference, but the CPU is actually being used.
I am running Podman Desktop 1.12, Podman 5.2.0, and the latest AI Lab extension (1.2.3), with GPU Inference enabled for the extension. It works fine on macOS.
Operating system
Windows 11
Installation Method
Installer from website/GitHub releases
Version
1.12.0
Steps to reproduce
At this point you can interact with the model. GPU inference is reported in the Podman Desktop UI, but CPU is being used with GPU idle.
Relevant log output
No response
Additional context
No response