Describe the bug
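When I run a Taichi kernel on an Nvidia GPU with the CUDA backend, it always works. However, when I run the same kernel on an AMD GPU with the Vulkan backend, it fails for a large input (m = 512) with a "Vulkan device might be lost" error.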
To Reproduce
Here is a minimal script that reproduces the problem.
import time
import numpy as np
import taichi as ti
ti.init(arch=ti.gpu)
@ti.kernel  # Test kernel
def test_kernel(a: ti.types.ndarray(ndim=3), b: ti.types.ndarray(ndim=3)):
    nslices, nrow, ncols = a.shape
    for sli, row, col in b:
        for n in range(nslices):
            b[sli, row, col] += a[n, row, col]
        b[sli, row, col] /= nslices
m = 64  # Test data size
m = 512  # Overrides the line above; comment this out to test m = 64
a = np.random.random((m, m, m)).astype('float32')
b = np.zeros_like(a)
t1 = time.time()
test_kernel(a, b)
t2 = time.time()
print(f'Time: {t2-t1}s')
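For reference, the kernel averages a over its first axis and writes that average into every slice of b. Below is a minimal NumPy sketch of the expected result, which may be useful for checking the output at small sizes. The helper name expected_result is hypothetical and is not part of the script above.

```python
import numpy as np


def expected_result(a: np.ndarray) -> np.ndarray:
    """Mean of `a` over axis 0, repeated for every slice (hypothetical helper)."""
    # Accumulate in float64 to limit rounding error in the reference value.
    mean_over_slices = a.mean(axis=0, dtype=np.float64)
    # b[sli, row, col] does not depend on sli, so broadcast the 2D mean
    # back to the full 3D shape and cast to the input dtype.
    return np.broadcast_to(mean_over_slices, a.shape).astype(a.dtype)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    a = rng.random((4, 3, 2)).astype("float32")
    b = expected_result(a)
    print(b.shape, b.dtype)
```

After calling test_kernel(a, b) with a small m, comparing b against expected_result(a) with np.allclose and a loose tolerance (float32 accumulation on the GPU will not match bit for bit) should show whether a backend produces the right values.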
Log/Screenshots
For Nvidia CUDA, here is the result:
When m = 64,
[Taichi] version 1.7.0, llvm 15.0.1, commit 2fd24490, win, python 3.10.10
[Taichi] Starting on arch=cuda
Time: 0.049997806549072266s
When m = 512,
[Taichi] version 1.7.0, llvm 15.0.1, commit 2fd24490, win, python 3.10.10
[Taichi] Starting on arch=cuda
Time: 1.6580820083618164s
For AMD Vulkan, here is the result:
When m = 64, it works.
[Taichi] version 1.7.1, llvm 15.0.1, commit 0f143b2f, win, python 3.10.6
[Taichi] Starting on arch=vulkan
Time: 0.040009260177612305s
However, when m = 512, it fails.
'C:\x5cUsers\x5cMango\x5cDesktop\x5cWFH\x5ctest.py' ;949194d8-2dc7-4409-bbea-16e3abdba9d4
[Taichi] version 1.7.1, llvm 15.0.1, comm
[W 07/27/24 19:56:04.137 17488] [cuda_driver.cpp:taichi::lang::CUDADriverBase::load_lib@36] nvcuda.dll lib not found.
RHI Error: (0) Vulkan device might be lost (vkQueueSubmit failed)
Assertion failed: false && "Error without return code", file C:\Users\buildbot\actions-runner\_work\taichi\taichi\taichi\rhi\vulkan\vulkan_device.cpp, line 2038
If I try it again, my screen goes black. I have to reboot my computer.
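To put the two test sizes in perspective: the kernel runs the inner loop over nslices once for every element of b, so the number of inner-loop iterations grows as m^4. Below is a rough sketch of that arithmetic. The helper name inner_iterations is hypothetical, and it counts loop iterations only, not actual GPU time.

```python
def inner_iterations(m: int) -> int:
    """Inner-loop iterations for an (m, m, m) input (hypothetical helper)."""
    elements_of_b = m ** 3      # one outer iteration per element of b
    return elements_of_b * m    # each one loops over all m slices of a


small = inner_iterations(64)
large = inner_iterations(512)
print(small)           # 16777216
print(large)           # 68719476736
print(large // small)  # 4096
```

So the m = 512 case performs 4096 times as many iterations as the m = 64 case within a single kernel launch. Whether that workload is what triggers the Vulkan device loss is not established here.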
Additional comments
Nvidia GPU: GeForce RTX 2080 SUPER 8GB
AMD GPU: Radeon RX 7700 XT 12 GB