@wanfuse Maybe this will help:
I downgraded as part of recovery from 20226 back to 20221 without any problem. Everything worked as before the downgrade and wsl2 started working well.
Just to be on the safe side I made a full system backup beforehand.
If you have had your PC for some time and "passed through" 20221, try several cycles of recovery to the previous Windows version.
If you never had it, e.g. your PC is less than two weeks old, you can install 20221 from the ISO while keeping all your files. This is a safe procedure.
Nevertheless, do not skip the backup. (Both Acronis and EaseUS Todo Backup work fine for me in making and restoring disk clones, including all partitions.)
You can get the 20221.1000 ISO image from one of the links mentioned in previous messages.
Good luck, Mickey
Build 20236 blog post announcement:
We fixed a regression that was breaking NVIDIA CUDA vGPU acceleration in the Windows Subsystem for Linux. Please see this GitHub thread for full details.
Build 20236 fixed it for me 👍
Can confirm that it works on 20236
Wait I received the new update too gotta check it out
Cuda on WSL2 works perfectly on build 20236. The problem seems to be resolved.
Works for me too:
$ python
> import tensorflow as tf
> tf.test.is_gpu_available()
...
2020-10-15 00:53:25.186406: I tensorflow/compiler/xla/service/service.cc:175] StreamExecutor device (0): GeForce RTX 2080 Ti, Compute Capability 7.5
True
$ docker stats
$ docker stop # dockers reported by stats
$ rm ~/.docker/config.json
$ sudo service docker stop
Docker already stopped - file /var/run/docker-ssd.pid not found.
$ sudo service docker start
> Windowed mode
> Simulation data stored in video memory
> Single precision floating point simulation
> 1 Devices used for simulation
MapSMtoCores for SM 7.5 is undefined. Default to use 64 Cores/SM
GPU Device 0: "GeForce RTX 2080 Ti" with compute capability 7.5
> Compute 7.5 CUDA device: [GeForce RTX 2080 Ti]
69632 bodies, total time for 10 iterations: 117.762 ms
= 411.730 billion interactions per second
= 8234.591 single-precision GFLOP/s at 20 flops per interaction
I confirm the same issue exists in 20231.1000
I confirm the issue is resolved in 20236.1000. Thanks to all who contributed momentum toward this.
docker run --gpus all nvcr.io/nvidia/k8s/cuda-sample:nbody nbody -gpu -benchmark -compare -fp64
works, so it seems, with output like so:
Run "nbody -benchmark [-numbodies=]" to measure performance.
-fullscreen (run n-body simulation in fullscreen mode)
-fp64 (use double precision floating point values for simulation)
-hostmem (stores simulation data in host memory)
-benchmark (run benchmark to measure performance)
-numbodies= (number of bodies (>= 1) to run in simulation)
-device= (where d=0,1,2.... for the CUDA device to use)
-numdevices= (where i=(number of CUDA devices > 0) to use for simulation)
-compare (compares simulation results running once on the default GPU and once on the CPU)
-cpu (run n-body simulation on the CPU)
-tipsy= (load a tipsy model file for simulation)
NOTE: The CUDA Samples are not meant for performance measurements. Results may vary when GPU Boost is enabled.
> Windowed mode
> Simulation data stored in video memory
> Double precision floating point simulation
> 1 Devices used for simulation
GPU Device 0: "GeForce GTX 1050 Ti" with compute capability 6.1
> Compute 6.1 CUDA device: [GeForce GTX 1050 Ti]
4096 bodies, total time for 10 iterations: 115.717 ms
= 1.450 billion interactions per second
= 43.495 double-precision GFLOP/s at 30 flops per interaction
while this container still fails with the error:
docker run --runtime=nvidia --rm -ti -v "${PWD}:/app" nricklin/ubuntu-gpu-test
modprobe: ERROR: ../libkmod/libkmod.c:556 kmod_search_moddep() could not open moddep file '/lib/modules/4.19.128-microsoft-standard/modules.dep.bin'
test.cu(29) : cudaSafeCall() Runtime API error : no CUDA-capable device is detected.
This is with the latest 20231.1000 installed
Lastly, the container mentioned below, which seems like a very good test, shows errors, but I think a kernel compile option that I am unaware of will take care of it. I will investigate further.
I am posting this second run line for others to use for testing the video card, since nvidia-smi is still not working and everything seems to suggest using nvidia-smi in the containers for examples. (Can't wait for the nvidia-smi thing to be fixed)
Anyway, here is the command:
docker run --runtime=nvidia --rm -ti -v "${PWD}:/app" tensorflow/tensorflow:1.13.2-gpu-py3-jupyter python /app/benchmark.py gpu 10000
(note: I am chaining the nvidia runtime container to a second container; I don't see many mentions of how to do this without docker-compose. Helpful?)
It produces output like:
Found device 0 with properties:
name: GeForce GTX 1050 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.392
pciBusID: 0000:01:00.0
totalMemory: 4.00GiB freeMemory: 3.30GiB
2020-10-15 02:16:20.656683: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1512] Adding visible gpu devices: 0
2020-10-15 02:16:20.658946: I tensorflow/core/common_runtime/gpu/gpu_device.cc:984] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-10-15 02:16:20.659002: I tensorflow/core/common_runtime/gpu/gpu_device.cc:990] 0
2020-10-15 02:16:20.659019: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 0: N
2020-10-15 02:16:20.659493: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1194] Could not identify NUMA node of platform GPU id 0, defaulting to 0. Your kernel may not have been built with NUMA support.
As you can see, NUMA support seems to be missing; this might not be a kernel compile problem but rather a TensorFlow version issue, as mentioned here:
https://stackoverflow.com/questions/55511186/could-not-identify-numa-node-of-platform-gpu (at the very bottom)
I am still investigating. Hope these test containers help someone else.
I'm still running into the "all CUDA-capable devices are busy or unavailable" error after updating to 20236, with CUDA 460.15 on Ubuntu 20.04. torch.cuda.is_available() returns True, but I can't move any variables to the device.
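In case it helps others reproduce, here is a minimal sketch of the check I mean, assuming PyTorch is installed in the active environment (torch.cuda.is_available() alone is not enough, since it can return True even when the first real allocation fails):
$ python3 -c "import torch; print(torch.cuda.is_available()); x = torch.zeros(4, device='cuda'); print(x.device)"
# On a working setup this prints True and then cuda:0; on the broken build the torch.zeros()
# call is where the "all CUDA-capable devices are busy or unavailable" error shows up.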
docker-compose does not support GPU docker containers out of the box. In order to use GPU containers with docker-compose, add this to the service definition:
version: "3.7"
...
device_requests:
- capabilities:
- "gpu"
You may get the error "docker.errors.InvalidVersion: device_requests param is not supported in API versions < 1.40", which is fixed by setting the environment variable COMPOSE_API_VERSION=1.40.
With this modified docker-compose, the gpu parameter will be passed to the docker container.
It is not a Microsoft problem but a docker-compose problem. With the modified docker-compose it works well.
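For example (a sketch, assuming the modified docker-compose mentioned above is on your PATH and the snippet is in your docker-compose.yml):
$ COMPOSE_API_VERSION=1.40 docker-compose up -d
# device_requests with the "gpu" capability is passed through to the daemon,
# which is roughly what docker run --gpus all does.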
I had the same trouble until 20231. CUDA might not work with build 20231; try build 20236.
In my case, it works well after the update
It is also fixed for me in 20236 so I am closing the issue. Thanks to all commenters and also WSL developers for fixing the issue.
Yes, CUDA on WSL is now working in 20236. I am creating a restore point in case this issue arises again in the future.
I'm using 20241, but CUDA is still not working in my WSL. Previously, when I used 20176, it worked.
import torch
torch.cuda.current_device()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/xiaothan/miniconda3/envs/maskrcnn_benchmark/lib/python3.7/site-packages/torch/cuda/__init__.py", line 377, in current_device
    _lazy_init()
  File "/home/xiaothan/miniconda3/envs/maskrcnn_benchmark/lib/python3.7/site-packages/torch/cuda/__init__.py", line 196, in _lazy_init
    _check_driver()
  File "/home/xiaothan/miniconda3/envs/maskrcnn_benchmark/lib/python3.7/site-packages/torch/cuda/__init__.py", line 101, in _check_driver
    http://www.nvidia.com/Download/index.aspx""")
AssertionError: Found no NVIDIA driver on your system. Please check that you have an NVIDIA GPU and installed a driver from http://www.nvidia.com/Download/index.aspx
I think they broke WSL-CUDA again... It did not work for me either. I tried reinstalling WSL, CUDA and NVIDIA driver. But WSL just failed to identify both NVIDIA driver and CUDA.
I have just updated to 20241, and for me there is no problem with 20241. Best, Mickey
$ docker ps
$ docker stop # dockers reported by ps
$ rm ~/.docker/config.json
$ sudo service docker stop
$ sudo service docker start
$ sudo mkdir /sys/fs/cgroup/systemd
$ sudo mount -t cgroup -o none,name=systemd cgroup /sys/fs/cgroup/systemd
$ docker ps
$ docker run hello-world
$ docker run --gpus all nvcr.io/nvidia/k8s/cuda-sample:nbody nbody -gpu -benchmark
> Windowed mode
> Simulation data stored in video memory
> Single precision floating point simulation
> 1 Devices used for simulation
MapSMtoCores for SM 7.5 is undefined. Default to use 64 Cores/SM
GPU Device 0: "GeForce RTX 2080 Ti" with compute capability 7.5
> Compute 7.5 CUDA device: [GeForce RTX 2080 Ti]
69632 bodies, total time for 10 iterations: 118.318 ms
= 409.797 billion interactions per second
= 8195.939 single-precision GFLOP/s at 20 flops per interaction
The nvidia docker works fine.
And so does Tensorflow
$ python
> import tensorflow as tf
> tf.test.is_gpu_available()
Your kernel may have been built without NUMA support.
2020-10-25 00:21:10.409370: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x555df5500de0 executing computations on platform CUDA. Devices:
2020-10-25 00:21:10.409404: I tensorflow/compiler/xla/service/service.cc:175] StreamExecutor device (0): GeForce RTX 2080 Ti, Compute Capability 7.5
True
> tf.test.gpu_device_name()
2020-10-25 00:22:14.701525: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1326] Created TensorFlow device (/device:GPU:0 with 9630 MB memory) -> physical GPU (device: 0, name: GeForce RTX 2080 Ti, pci bus id: 0000:09:00.0, compute capability: 7.5)
'/device:GPU:0'
Works for me. By the way, I reinstall the NVIDIA driver 460 every time after an update.
Ok! Solved it! Sorry if I misguided anybody. 😉 My NVIDIA driver and Ubuntu distro got messed up after the update. Re-installing the NVIDIA driver and repairing Ubuntu did the trick:
user@DESKTOP:/usr/local/cuda/samples/4_Finance/BlackScholes$ ./BlackScholes
[./BlackScholes] - Starting...
GPU Device 0: "Pascal" with compute capability 6.1
Initializing data...
...allocating CPU memory for options.
...allocating GPU memory for options.
...generating input data in CPU mem.
...copying input data to GPU mem.
Data init done.
Executing Black-Scholes GPU kernel (512 iterations)...
Options count : 8000000
BlackScholesGPU() time : 1.083354 msec
Effective memory bandwidth: 73.844778 GB/s
Gigaoptions per second : 7.384478
BlackScholes, Throughput = 7.3845 GOptions/s, Time = 0.00108 s, Size = 8000000 options, NumDevsUsed = 1, Workgroup = 128
Reading back GPU results...
Checking the results...
...running CPU calculations.
Comparing the results...
L1 norm: 1.741792E-07
Max absolute error: 1.192093E-05
Shutting down...
...releasing GPU memory.
...releasing CPU memory.
Shutdown done.
[BlackScholes] - Test Summary
NOTE: The CUDA Samples are not meant for performance measurements. Results may vary when GPU Boost is enabled.
Test passed
Can you paste the contents of BlackScholes?
Seems like a good test script!
Thanks!
Which NVIDIA driver are you using? My NVIDIA driver version is 460.20 with Ubuntu 20.04, and it still didn't work.
@wanfuse123: NVIDIA CUDA examples: https://docs.nvidia.com/cuda/wsl-user-guide/index.html#unique_626646811 https://github.com/NVIDIA/cuda-samples
Reinstall the NVIDIA driver and Ubuntu; that worked for me with 20241.
Reinstall the NV driver (460.20_gameready_win10-dch_64bit_international) and restart Win10. The problem has been resolved. Win10: 20246.1; WSL2: Ubuntu 20.04.
NV driver 460.20; Win10: 20246.1; WSL2: Ubuntu 18.04.
Today I am getting:
$ docker run --gpus all nvcr.io/nvidia/k8s/cuda-sample:nbody nbody -gpu -benchmark
docker: Error response from daemon: could not select device driver "" with capabilities: [[gpu]].
ERRO[0001] error waiting for container: context canceled
Reboots do not help. Reinstalled https://developer.nvidia.com/46020-gameready-win10-dch-64bit-international.exe
Did not help.
Any idea ?
@tadam98 what's the output of ls -la /dev/dxg from bash?
My whole computer crashed and windows insider could not boot. So it is back in the shop for a short while getting a disk massage or something.
Rebuilt my computer. Now the Windows insider version is 20251.1 (downgrade to 2.3.0 did not help)
On the windows host docker 2.5.0.0 was installed.
$ docker run hello-world
Hello from Docker!
This message shows that your installation appears to be working correctly.
For some reason:
$ sudo service docker stop
docker: unrecognized service
It was not so in my previous system.
docker run --gpus all nvcr.io/nvidia/k8s/cuda-sample:nbody nbody -gpu -benchmark
docker: Error response from daemon: could not select device driver "" with capabilities: [[gpu]].
ERRO[0000] error waiting for container: context canceled
And, as requested:
~$ ls -la /dev/dxg
crw-rw-rw- 1 root root 245, 0 Nov 8 05:49 /dev/dxg
@tadam98 That docker you are running is probably the Docker Desktop for Windows, which doesn't support GPU in WSL2 yet. What's the output of ls -la $(which docker)?
-rwxr-xr-x 1 root root 89213816 Oct 14 19:52 /usr/bin/docker
docker --version
Docker version 19.03.6, build 369ce74a3c
python
import tensorflow as tf
tf.__version__
'1.14.0'
tf.test.is_gpu_available()
2020-11-08 14:32:14.364926: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1640] Found device 0 with properties:
name: GeForce RTX 2080 Ti major: 7 minor: 5 memoryClockRate(GHz): 1.65
pciBusID: 0000:0a:00.0
True
tf.test.gpu_device_name()
name: GeForce RTX 2080 Ti major: 7 minor: 5 memoryClockRate(GHz): 1.65
pciBusID: 0000:0a:00.0
2020-11-08 14:32:25.081122: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1326] Created TensorFlow device (/device:GPU:0 with 9630 MB memory) -> physical GPU (device: 0, name: GeForce RTX 2080 Ti, pci bus id: 0000:0a:00.0, compute capability: 7.5)
'/device:GPU:0'
As can be seen, the GPU is seen in WSL2. I just have the docker problem.
@tadam98 Since that looks like an nvidia docker bug, you should post the issue here.
Hey, running on winver 20251.1 and driver 460.20. Following the guide on Cuda on WSL:
/usr/local/cuda/samples/4_Finance/BlackScholes$ sudo make
/usr/local/cuda-11.0/bin/nvcc -ccbin g++ -I../../common/inc -m64 -maxrregcount=16 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_80,code=compute_80 -o BlackScholes.o -c BlackScholes.cu
nvcc warning : The 'compute_35', 'compute_37', 'compute_50', 'sm_35', 'sm_37' and 'sm_50' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
ptxas warning : For profile sm_70 adjusting per thread register count of 16 to lower bound of 24
ptxas warning : For profile sm_75 adjusting per thread register count of 16 to lower bound of 24
ptxas warning : For profile sm_80 adjusting per thread register count of 16 to lower bound of 24
/usr/local/cuda-11.0/bin/nvcc -ccbin g++ -I../../common/inc -m64 -maxrregcount=16 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_80,code=compute_80 -o BlackScholes_gold.o -c BlackScholes_gold.cpp
nvcc warning : The 'compute_35', 'compute_37', 'compute_50', 'sm_35', 'sm_37' and 'sm_50' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
/usr/local/cuda-11.0/bin/nvcc -ccbin g++ -m64 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_80,code=compute_80 -o BlackScholes BlackScholes.o BlackScholes_gold.o
nvcc warning : The 'compute_35', 'compute_37', 'compute_50', 'sm_35', 'sm_37' and 'sm_50' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
mkdir -p ../../bin/x86_64/linux/release
cp BlackScholes ../../bin/x86_64/linux/release
/usr/local/cuda/samples/4_Finance/BlackScholes$ sudo ./BlackScholes
[./BlackScholes] - Starting...
CUDA error at ../../common/inc/helper_cuda.h:777 code=35(cudaErrorInsufficientDriver) "cudaGetDeviceCount(&device_count)"
Any ideas on how to solve this?
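For the cudaErrorInsufficientDriver case, two quick sanity checks are worth running before anything else (a sketch; these are the usual WSL2 GPU paths, adjust if yours differ):
$ ls -la /dev/dxg                    # the GPU paravirtualization device WSL2 exposes
$ ls -la /usr/lib/wsl/lib/libcuda*   # the CUDA driver stub mounted in from the Windows host
# If either of these is missing, the Windows-side driver install most likely did not take
# effect, and reinstalling things inside the distro will not help.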
If both of the following run, you are usually fine.
docker run --gpus all nvcr.io/nvidia/k8s/cuda-sample:nbody nbody -gpu -benchmark
And,
python
import tensorflow as tf
tf.__version__
tf.test.is_gpu_available()
tf.test.gpu_device_name()
In my case the docker fails and the python works. You may need to alter the python commands slightly if the tf version is different.
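For instance, on TensorFlow 2.x, tf.test.is_gpu_available() is deprecated; the equivalent check would be something like this (a sketch, run inside the same environment):
$ python3 -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"
# On a working setup this prints a list containing a PhysicalDevice with device_type='GPU';
# an empty list [] means the GPU is not visible to TensorFlow.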
Microsoft just released 20257.1. Any opinions?
I have 20257 and am getting the cudaErrorInsufficientDriver error.
This makes it work. Basically, install docker on WSL2: https://dev.to/bartr/install-docker-on-windows-subsystem-for-linux-v2-ubuntu-5dl7
After restarting WSL2, do the following procedure:
$ docker ps
$ docker ps -aq
$ docker stop $(docker ps -aq)
$ docker stop # dockers reported by ps
$ rm ~/.docker/config.json
$ sudo service docker stop
$ sudo service docker start
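# WSL2 does not boot with systemd, so the systemd cgroup hierarchy is not mounted by default.
# The next two commands recreate it (needed again after every WSL restart) so that GPU
# containers do not trip over the missing mount: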
$ sudo mkdir /sys/fs/cgroup/systemd
$ sudo mount -t cgroup -o none,name=systemd cgroup /sys/fs/cgroup/systemd
$ docker ps
$ docker run hello-world
$ docker run --gpus all nvcr.io/nvidia/k8s/cuda-sample:nbody nbody -gpu -benchmark
GPU Device 0: "GeForce RTX 2080 Ti" with compute capability 7.5
> Compute 7.5 CUDA device: [GeForce RTX 2080 Ti]
69632 bodies, total time for 10 iterations: 117.583 ms
= 412.357 billion interactions per second
= 8247.147 single-precision GFLOP/s at 20 flops per interaction
One more thing you need to know: the usual docker-compose (the latest is version 1.27) that gets installed with docker does not support starting containers with a GPU. The only way to manage GPU containers is to use an experimental version of docker-compose that does support them.
$ which docker-compose
$ sudo rm -rf <folder of docker-compose>
$ pip install git+https://github.com/yoanisgil/compose.git@device-requests
$ docker-compose --version
docker-compose version 1.26.0dev, build unknown
# to use it you have to make sure that API version is 1.40. This can be done on the same command-line:
$ COMPOSE_API_VERSION=1.40 docker-compose up
    container_name: docker_name_needing_gpu
    environment:
      NVIDIA_DRIVER_PRESENT: "True"
    device_requests:
      - capabilities:
          - "gpu"
Windows Insider updated to 20262.1. With the update, the regular NVIDIA driver was re-installed. Not good for WSL2. Also, NVIDIA issued a new driver that you should install in Windows: https://developer.nvidia.com/cuda/wsl. The new driver is 465.12. Download and install it. You must reboot after the install or WSL2 will not see the GPU as of yet. Start WSL2:
$ python
> import tensorflow as tf
> tf.test.is_gpu_available()
2020-11-19 17:15:13.395676: I tensorflow/compiler/xla/service/service.cc:175] StreamExecutor device (0): GeForce RTX 2080 Ti, Compute Capability 7.5
True
Then do the above docker procedures to check that the docker also works. In my case, it works fine.
After doing this, it still didn't work for me.
I was able to get it to work by following it up with this:
sudo apt-key adv --fetch-keys http://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/7fa2af80.pub
sudo sh -c 'echo "deb http://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64 /" > /etc/apt/sources.list.d/cuda.list'
sudo apt-get update
sudo apt-get install -y cuda-toolkit-11-1
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh
conda config --set auto_activate_base false
conda create --name directml python=3.6
conda activate directml
pip install tensorflow-directml
import tensorflow.compat.v1 as tf
tf.enable_eager_execution(tf.ConfigProto(log_device_placement=True))
print(tf.add([1.0, 2.0], [3.0, 4.0]))
And there, I finally saw the name of my GPU pop up!
After that I went back and tried the ./BlackScholes test:
cd /usr/local/cuda/samples/4_Finance/BlackScholes
sudo make
./BlackScholes
And it also worked!
Hi @serg06,
When you said "After doing this, it still didn't work for me", did you also follow the docker instructions one post above? (I am working with CUDA 10.0 and it is working fine, with other examples, since 10.0 does not have the BlackScholes sample. Actually the CUDA version does not matter at all; once Python and the nvidia docker container see the GPU, the rest is details.)
Best, Mickey
Hi @tadam98, no I didn't try that.
@serg06 Interesting.
I see that you are not using the nvidia docker container. The procedure is not needed if you do not use it and do not use docker containers that need a GPU.
Are you using the Desktop docker exclusively, without the additional installation of the Ubuntu docker as in the post?
Just at the tensorflow level, tensorflow-gpu works fine for me, without the need for tensorflow-directml. Since I am still using version 1.14, I had to install CUDA 10.x and cuDNN 7.6.5, which work fine.
Best, Mickey
I just updated to 20262.1010, reinstalled NVIDIA driver 465.12, installed Ubuntu 20.04 on WSL2, and set it up accordingly (used the CUDA WSL2 tutorial and replaced 1804 with 2004), and it just worked! Even my previous Ubuntu 18.04 started working for CUDA (it wasn't working on the older build). Thanks to @serg06!
How can I track when this version is merged into the mainline? I don't want to run an Insider build :)
Oh, if you wait another 3-6 months it may be integrated into the main builds.
After doing the update to 20262.1 and installing the 465.12 driver described above, it still didn't work for me.
I was able to get it to work by following it up with this:
- Installed everything else that Windows Update wanted to install and rebooted
- Reinstalled the 465.12 driver
- On the first screen, I selected "Drivers + GeForce Experience"
- On the second screen, I selected Custom, then pressed Next and checked "Perform a clean install"
- Uninstalled all my existing Ubuntu installations
- Restarted Windows
- Installed Ubuntu from Microsoft store (the one called "Ubuntu" with no version) (it installed Ubuntu 20.04)
- Restarted Windows
Followed these instructions to add the sources to my sources lists, but I modified the URLs to be 2004 instead of 1804:
sudo apt-key adv --fetch-keys http://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/7fa2af80.pub
sudo sh -c 'echo "deb http://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64 /" > /etc/apt/sources.list.d/cuda.list'
sudo apt-get update
Continued to follow the instructions by installing Nvidia Toolkit 11.1 (not 11.0)
sudo apt-get install -y cuda-toolkit-11-1
Followed these instructions to test the installation
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh
When it asks whether to init conda, hit yes
restart terminal
conda config --set auto_activate_base false
restart terminal
conda create --name directml python=3.6
conda activate directml
pip install tensorflow-directml
Then in Python
import tensorflow.compat.v1 as tf
tf.enable_eager_execution(tf.ConfigProto(log_device_placement=True))
print(tf.add([1.0, 2.0], [3.0, 4.0]))
And there, I finally saw the name of my GPU pop up!
After that I went back and tried the ./BlackScholes test:
cd /usr/local/cuda/samples/4_Finance/BlackScholes
sudo make
./BlackScholes
And it also worked!
Thanks for posting your steps. I also run Ubuntu in WSL2 and use the Nvidia docker containers as explained in https://docs.nvidia.com/cuda/wsl-user-guide/index.html#installing-nvidia-drivers I also faced the same problem (nvcr.io/nvidia/tensorflow not working in WSL 2 after the update of Windows Insider version).
From your steps, I only had to re-install the CUDA driver ( https://developer.nvidia.com/cuda/wsl ). In my case I just had to run the installer like I normally did. I didn't even have to restart WSL.
I ran the benchmark container to check if everything works:
docker run --gpus all nvcr.io/nvidia/k8s/cuda-sample:nbody nbody -gpu -benchmark
The benchmark passed and my nvcr.io/nvidia/tensorflow container works again.
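For reference, that container gets started the usual way (a sketch; substitute whichever image tag you normally pull, shown here as a placeholder):
$ docker run --gpus all -it --rm nvcr.io/nvidia/tensorflow:<tag>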
I can confirm that installing the latest docker edge, Microsoft Edge packages, and NVIDIA 465.12 makes it all work again.
I had stopped working on this for a month because an update killed it; it was badly broken.
Fortunately it is working again.
Even better than before actually because now NVIDIA DOCKER can run in multiple containers simultaneously!!!
also can confirm that nvidia-smi seems to at least partially work once again!
Updated to Insider version 20270.1 still with 465.12 and all is working well as before.
@ArieTwigt Not sure how it worked for you as you did not include "experimental" in the .list as in the instructions.
curl -s -L https://nvidia.github.io/libnvidia-container/experimental/$distribution/libnvidia-container-experimental.list | sudo tee /etc/apt/sources.list.d/libnvidia-container-experimental.list
Environment
Steps to reproduce
Exactly followed the instructions available at https://docs.nvidia.com/cuda/wsl-user-guide/index.html. Tested on a previously working Ubuntu WSL image (IIRC GPU last worked on 20206, then the whole WSL2 stopped working). Also tested on newly created Ubuntu 18.04 and Ubuntu 20.04 images.
I have tested CUDA compatible NVIDIA drivers 455.41 & 460.20. I have tried removing all drivers etc. I have also tested using CUDA 10.2 & CUDA 11.0.
It was tested on two separate machines (one Intel + GTX 1060, the other Ryzen + RTX 2080 Ti).
The issue was tested directly in the OS and also in docker containers inside it.
Example (directly in Ubuntu):
Example in container:
Expected behavior
CUDA working inside WSL2
Actual behavior
All tests using CUDA inside WSL Ubuntu result in various CUDA errors, mostly referring to no CUDA devices being available.