Closed: LonghronShen closed this issue 8 months ago
I am currently travelling and will review this PR later
@LonghronShen It doesn't seem to be working on my server. nvidia-smi
works correctly inside the container, and nvidia-container-toolkit
is installed correctly on my Ubuntu host. So the environment should be set up properly, but the backend still runs on the CPU.
Hi @xiongsp , have you tried the docker-compose.yml
file, which contains the GPU settings?
Could you post the container log here for diagnostics? Please also post your docker-compose.yml
here~
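(For anyone reading along: in a Compose file, GPU access is typically granted through a device reservation. The fragment below is only a sketch; the service name, image, and other values are placeholders, not the project's actual docker-compose.yml.)

```yaml
# Hypothetical sketch of the GPU settings in a Compose file.
# Service name and image are placeholders — adapt to the real file.
services:
  backend:
    image: your-image            # placeholder
    ports:
      - "27777:27777"
    volumes:
      - ./models:/models
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all         # or a specific number of GPUs
              capabilities: [gpu]
```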
Hi @LonghronShen , for some reason I didn't use docker-compose;
instead I started the container with docker run --name name -d -v ./models:/models -p 27777:27777 --gpus all images
Here is the log:
==========
== CUDA ==
==========
CUDA Version 11.6.1
Container image Copyright (c) 2016-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:
https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license
A copy of this license is made available in this container at /NGC-DL-CONTAINER-LICENSE for your convenience.
*************************
** DEPRECATION NOTICE! **
*************************
THIS IMAGE IS DEPRECATED and is scheduled for DELETION.
https://gitlab.com/nvidia/container-images/cuda/blob/master/doc/support-policy.md
INFO: Started server process [1]
INFO: Waiting for application startup.
INFO: Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:27777 (Press CTRL+C to quit)
--- 0.754920482635498 seconds ---
torch found: /usr/local/lib/python3.10/dist-packages/torch/lib
torch set
Strategy Devices: {'cpu'}
state cache enabled
RWKV_JIT_ON 1 RWKV_CUDA_ON 0 RESCALE_LAYER 0
Loading /models/RWKV-x060-World-3B-v2-20240228-ctx4096.pth ...
Model detected: v6.0
BTW, this is what nvidia-smi
shows inside the container:
root@76f5f4497a74:/app# nvidia-smi
Thu Mar 7 13:16:57 2024
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.125.06 Driver Version: 525.125.06 CUDA Version: 12.0 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 Tesla V100-PCIE... Off | 00000000:2F:00.0 Off | 0 |
| N/A 30C P0 24W / 250W | 13MiB / 32768MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
| 1 Tesla V100-PCIE... Off | 00000000:86:00.0 Off | 0 |
| N/A 28C P0 23W / 250W | 13MiB / 32768MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
+-----------------------------------------------------------------------------+
Hi @xiongsp ,
According to the log, I think you invoked the switch-model
web API with a wrong parameter.
For reference, you can try something like this:
curl http://127.0.0.1:27777/switch-model -X POST -H "Content-Type: application/json" -d '{"model":"./models/RWKV-x060-World-3B-v2-20240228-ctx4096.pth","strategy":"cuda fp16","customCuda":"true","deploy":"true"}'
Note that the strategy should be cuda, not cpu.
Thanks @LonghronShen ! It works! It may be difficult to bake the curl
call into the Dockerfile,
but a doc would be helpful. Thanks!
Changes
Updated docker-compose.yml
to change the startup mode for the Python backend.
One more thing: you can also extend docker-compose.yml
with a gateway for authentication or something else.
FAQ
How to install nvidia-container-toolkit
If you want to use the CUDA strategy in Docker, you need to install nvidia-container-toolkit first. For example, this is the installation script for Ubuntu.
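(The script itself appears to be cut off here. As a sketch, the steps below follow NVIDIA's published apt-based installation instructions for Ubuntu; check the official NVIDIA Container Toolkit docs for the current version.)

```shell
# Add NVIDIA's package repository and signing key
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | \
  sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
  sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
  sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list

# Install the toolkit
sudo apt-get update
sudo apt-get install -y nvidia-container-toolkit

# Configure the Docker runtime to use the NVIDIA runtime, then restart Docker
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker
```

After this, `docker run --gpus all ...` should expose the host GPUs to the container.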