Hello,

When running

    python test.py

I get the following errors:

=====================================
ERROR: test_groups (__main__.Test)
Traceback (most recent call last):
  File "/fsx/mohamed/dev/marlin/test.py", line 155, in test_groups
    self.run_problem(m, n, k, *thread_shape, groupsize)
  File "/fsx/mohamed/dev/marlin/test.py", line 66, in run_problem
    torch.cuda.synchronize()
  File "/admin/home/mohamed_mekkouri/miniconda3/envs/exp/lib/python3.10/site-packages/torch/cuda/__init__.py", line 792, in synchronize
    return torch._C._cuda_synchronize()
RuntimeError: CUDA error: an illegal instruction was encountered
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.

=======================================
ERROR: test_k_stages_divisibility (__main__.Test)
Traceback (most recent call last):
  File "/fsx/mohamed/dev/marlin/test.py", line 80, in test_k_stages_divisibility
    self.run_problem(16, 2 * 256, k, 64, 256)
  File "/fsx/mohamed/dev/marlin/test.py", line 60, in run_problem
    A = torch.randn((m, k), dtype=torch.half, device=DEV)
RuntimeError: CUDA error: an illegal instruction was encountered
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.

========================================
ERROR: test_tiles (__main__.Test)
Traceback (most recent call last):
  File "/fsx/mohamed/dev/marlin/test.py", line 75, in test_tiles
    self.run_problem(m, 2 * 256, 1024, thread_k, thread_n)
  File "/fsx/mohamed/dev/marlin/test.py", line 60, in run_problem
    A = torch.randn((m, k), dtype=torch.half, device=DEV)
RuntimeError: CUDA error: an illegal instruction was encountered
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.

===========================================
ERROR: test_very_few_stages (__main__.Test)
Traceback (most recent call last):
  File "/fsx/mohamed/dev/marlin/test.py", line 85, in test_very_few_stages
    self.run_problem(16, 2 * 256, k, 64, 256)
  File "/fsx/mohamed/dev/marlin/test.py", line 60, in run_problem
    A = torch.randn((m, k), dtype=torch.half, device=DEV)
RuntimeError: CUDA error: an illegal instruction was encountered
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.

Ran 6 tests in 0.794s

FAILED (errors=4)

The stack I am using:

python 3.10.14
torch 2.3.1
cuda_12.1.r12.1
compute_cap 9.0
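In case it helps with reproducing, the stack details above can be gathered with a short script along these lines. This is only a minimal sketch: `collect_env` is a hypothetical helper name (not part of the Marlin repo), and `torch.version.cuda` reports the CUDA version torch was built against, which may differ from the `nvcc` release string quoted above.

```python
import platform


def collect_env():
    """Collect Python/torch/CUDA details for a bug report (hypothetical helper)."""
    info = {"python": platform.python_version()}
    try:
        import torch
    except ImportError:
        # torch is not installed in this environment; report only Python
        info["torch"] = None
        return info
    info["torch"] = torch.__version__
    # CUDA version torch was built with (e.g. "12.1"), not the nvcc release string
    info["cuda"] = torch.version.cuda
    if torch.cuda.is_available():
        # Compute capability of device 0 as a (major, minor) tuple
        major, minor = torch.cuda.get_device_capability(0)
        info["compute_cap"] = f"{major}.{minor}"
    return info


if __name__ == "__main__":
    for key, value in collect_env().items():
        print(key, value)
```

The compute capability line only appears when a CUDA device is visible to torch.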