Open 88plug opened 2 weeks ago
./local.sh build created it — however, this wasn't clear from the instructions.
I was able to build successfully on Ubuntu 24.
Still having issues, since there are no version requirements or other specifics. I have CUDA/nvcc installed but get:
Running test(s)...
✗ CUDA is not available. Expected True but got [cuGetProcAddress: Mapped symbol 'cuGraphExecGetFlags' to function: 0x7fff1fafb778].
✗ Tensor failed. Got [cuGetProcAddress: Mapped symbol 'cuGraphExecGetFlags' to function: 0x7fff13ad8dd8].
✗ Tensor failed. Got [cuGetProcAddress: Mapped symbol 'cuGraphExecGetFlags' to function: 0x7ffc823b0ff8].
OS: Ubuntu 24.04.1 LTS x86_64
Kernel: 6.8.0-45-generic
Terminal: /dev/pts/0
CPU: AMD EPYC-Rome (256) @ 2.249GHz
GPU: NVIDIA RTX 4000 Ada Generation
Memory: 6647MiB / 108602MiB
| NVIDIA-SMI 560.35.03 Driver Version: 560.35.03 CUDA Version: 12.6 |
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2024 NVIDIA Corporation
Built on Thu_Sep_12_02:18:05_PDT_2024
Cuda compilation tools, release 12.6, V12.6.77
Build cuda_12.6.r12.6/compiler.34841621_0
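Since the failures above mention driver symbol lookups rather than a missing toolkit, one way to narrow things down is to confirm the driver API initializes outside the project entirely. A minimal sketch, assuming only that libcuda.so.1 from the installed driver is on the loader path (the helper name cuda_driver_status is mine, not from the project):

```python
import ctypes

def cuda_driver_status():
    """Return cuInit's CUresult (0 == CUDA_SUCCESS), or None if the
    driver library itself cannot be loaded."""
    try:
        lib = ctypes.CDLL("libcuda.so.1")
    except OSError:
        return None  # no NVIDIA driver library on this machine
    lib.cuInit.argtypes = [ctypes.c_uint]
    lib.cuInit.restype = ctypes.c_int
    return lib.cuInit(0)  # 0 means the driver itself is usable

print(cuda_driver_status())
```

A non-zero CUresult here would point at the driver install rather than scuda; None means libcuda.so.1 isn't visible to the dynamic loader at all.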
Love this project and trying to get a Dockerfile.client and Dockerfile.server working!
Thanks! Apologies for the light documentation and I appreciate you still trying :)
Which GPU model are you using? We have been testing with a 4090; I wonder if there are differences...
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 560.35.03 Driver Version: 560.35.03 CUDA Version: 12.6 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 NVIDIA RTX 4000 Ada Gene... Off | 00000000:01:00.0 Off | Off |
| 30% 39C P8 14W / 130W | 2MiB / 20475MiB | 0% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
| 1 NVIDIA RTX 4000 Ada Gene... Off | 00000000:02:00.0 Off | Off |
| 30% 37C P8 12W / 130W | 2MiB / 20475MiB | 0% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
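For reference, the two devices in the table above can also be enumerated straight from the driver API, which helps rule out a driver-level problem when testing multi-GPU. A hedged ctypes sketch (helper name is mine; it returns None when no driver is present or initialization fails):

```python
import ctypes

def cuda_device_count():
    """Number of CUDA devices per the driver API, or None if the
    driver library is unavailable or fails to initialize."""
    try:
        lib = ctypes.CDLL("libcuda.so.1")
    except OSError:
        return None
    if lib.cuInit(0) != 0:  # 0 == CUDA_SUCCESS
        return None
    count = ctypes.c_int(0)
    if lib.cuDeviceGetCount(ctypes.byref(count)) != 0:
        return None
    return count.value

print(cuda_device_count())
```

On the box shown above this should report 2; anything else would suggest the driver, not scuda, is hiding a device.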
Have you tested or thought through multiple GPUs?
The code today does work with multiple GPUs on the same host. There is a plan for supporting multiple GPUs across separate hosts, but work on that hasn't started yet. What happens if you run something like:
./local.sh build && SCUDA_SERVER=127.0.0.1 LD_PRELOAD=$(pwd)/libscuda.so nvidia-smi
replacing 127.0.0.1 with the IP of your ./local.sh server instance?
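Spelled out, that check looks something like the sketch below. It assumes a scuda checkout in the current directory; local.sh and libscuda.so are names taken from this thread, and the guard just avoids a confusing failure when run from somewhere else:

```shell
#!/bin/sh
# Smoke-test a scuda client against a running `./local.sh server`.
# SCUDA_SERVER defaults to localhost; override with the server's IP.
scuda_smoke_test() {
    if [ -x ./local.sh ]; then
        # build produces libscuda.so, which we then preload so that
        # nvidia-smi's CUDA calls are routed to the scuda server
        ./local.sh build &&
        SCUDA_SERVER="${SCUDA_SERVER:-127.0.0.1}" \
            LD_PRELOAD="$(pwd)/libscuda.so" nvidia-smi
    else
        echo "run this from the scuda repository root" >&2
        return 1
    fi
}
scuda_smoke_test || true
```

If nvidia-smi shows the server's GPUs here, the preload and transport are working and the remaining failures are higher up the stack.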
Trying to build from the Dockerfile from source and I'm stuck on
/opt/cuda/bin/nvcc -shared -o libscuda.so client.c
Where is client.c?