edgar971 / open-chat

A self-hosted, offline, ChatGPT-like chatbot with different LLM support. 100% private, with no data leaving your device.
MIT License

Why zero GPU activity? #12

Open zibberzoo opened 9 months ago

zibberzoo commented 9 months ago

I have an Nvidia RTX 5000 that works flawlessly with docker under WSL2 on Windows 11 Pro. However, using the default docker-compose-13b.yml which is set to use Docker-cuda with 64 GPU layers, I don't see any GPU activity -- only high CPU usage (i7-9850H) and extremely slow results. Am I misinterpreting GPU support? Thanks. -Z

sic79 commented 9 months ago

Same issue here, only difference is that I use a Quadro P5000 GPU.

m0d3rnX commented 9 months ago

nvtop even shows a running compute task correctly, and the VRAM fills up, but there is little to no GPU usage while the CPU runs at full throttle.
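For anyone wanting to confirm the same pattern (VRAM allocated, compute nearly idle) without nvtop, here is a minimal sketch that polls `nvidia-smi` through its CSV query mode. The function names and the thresholds are illustrative assumptions, not part of open-chat, and the parser assumes every field comes back numeric (some GPUs report `[N/A]` for utilization, which this sketch does not handle).

```python
import subprocess

# Fields requested from nvidia-smi, in the order they will appear in each row.
QUERY = "utilization.gpu,memory.used,memory.total"


def parse_gpu_stats(csv_text):
    """Parse 'util, used, total' CSV rows (no header, no units), one row per GPU."""
    stats = []
    for line in csv_text.strip().splitlines():
        util, used, total = (float(field.strip()) for field in line.split(","))
        stats.append(
            {"util_pct": util, "mem_used_mib": used, "mem_total_mib": total}
        )
    return stats


def looks_loaded_but_idle(gpu, util_threshold=5.0, mem_fraction=0.5):
    """True when more than mem_fraction of VRAM is in use but utilization is low.

    Both thresholds are arbitrary illustrative defaults.
    """
    mem_ratio = gpu["mem_used_mib"] / gpu["mem_total_mib"]
    return gpu["util_pct"] < util_threshold and mem_ratio > mem_fraction


def read_gpu_stats():
    """Run nvidia-smi once and return the parsed per-GPU stats."""
    result = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY}", "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_gpu_stats(result.stdout)
```

Calling `read_gpu_stats()` while a response is being generated and passing each entry to `looks_loaded_but_idle` should flag the situation described in this thread: the model sits in VRAM, yet the GPU does almost none of the work.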

SurvivaLlama commented 9 months ago

Same here. 4060ti w/16gb vram. No GPU use, just 100% cpu.

TheQuickestFox commented 8 months ago

RTX 3070, same problem: zero GPU usage, 100% multi-core CPU usage. I am running a 5700G, so I wondered if it was caused by the iGPU, but it looks like I'm not alone and it's not just an AMD issue.

fritolays commented 8 months ago

Same thing here with a p100, 100% cpu usage....

TheresaCBE commented 8 months ago

Using RTX 3080 on v1.0.6, CPU also 100%.

MDKAOD commented 5 months ago

+1

Tesla P4 & Quadro P400. The log recognizes the P4 and also says it's been set as the primary device. Zero GPU load, 100% CPU when generating responses.

timetheonce commented 5 months ago

same issue for Tesla T4

mattybeens commented 5 months ago

same issue here 3060 12GB

AuraFross commented 4 months ago

I had the same issue until I started raising the Number Of GPU Layers setting. I started with 64, and as I went higher I started to see more GPU (Quadro RTX 4000) than CPU (AMD 5800X) usage, and the LLM moved into my GPU RAM. Now I'm at 1024, but 512 worked just as well for me. I'm running Unraid. Note: it still uses the CPU, but for nowhere near as long. Hope this helps.
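One possible reading of 512 and 1024 behaving the same: if the backend is llama.cpp (an assumption; nothing in this thread confirms it), a requested layer count above the model's total is capped, so every value past that point offloads the whole model. A minimal sketch of that rule follows; `effective_gpu_layers` is a hypothetical helper and the 80-layer model is an illustrative figure, not a measurement from this setup.

```python
def effective_gpu_layers(requested, model_layers):
    """Layers actually offloaded under the assumed capping rule.

    requested:    value entered for Number Of GPU Layers (non-negative)
    model_layers: total offloadable layers in the loaded model (hypothetical input)
    """
    if requested < 0 or model_layers < 0:
        raise ValueError("layer counts must be non-negative")
    # Requests beyond the model's depth cannot offload more than exists.
    return min(requested, model_layers)


# Illustrative only: a hypothetical 80-layer model.
for requested in (64, 512, 1024):
    print(requested, "->", effective_gpu_layers(requested, 80))
```

Under this assumption, 64 would leave part of an 80-layer model on the CPU, while 512 and 1024 would both offload all of it.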