-
I compiled and installed the CUTLASS dynamic library, which generated the `libcutlass.so` file, and then followed the example (https://github.com/NVIDIA/cutlass/tree/main/examples/60_cutlass_import).
Ru…
-
**Describe the bug**
When I run the depthwise convolution example `./46_depthwise_simt_conv2dfprop`, it reports an error:
`Got cutlass error: Error Internal at: 500`
My GPU is a Titan Xp, and the CUDA v…
-
**Describe the bug**
I implemented SmoothQuant INT8 inference for PyTorch with `CUTLASS` INT8 GEMM kernels, which are wrapped as PyTorch modules in [torch-int](https://github.com/Guangxuan-Xiao/torch…
-
```
TaichiCompilationError:
File "C:\Users\bobca\test.py", line 336, in rasterize:
rasterize_count = ti.simt.block.SharedArray(1, ti.i32)
^^^^^^^^^^^^^^^^…
```
-
**Describe the bug**
When I alter some configurations in example `36_gather_scatter_fusion`, specifically using general SIMT instead of tensor op, the program sometimes illegally accesses `__global__` memo…
-
Hello!
I am seeing the following error when using a function that is built on CUTLASS (the message is repeated on many lines):
`cutlass/include/cutlass/arch/mma_sm80.h:516: void cutlass::arch::Mma::operator…
-
Minimal reproduction:
```python
import taichi as ti

block_dim = 64
N = 256
ti.init(arch=ti.cuda, print_ir=True, print_kernel_llvm_ir=True)

@ti.kernel
def test(out: ti.types.ndarray()):
    t…
```
-
### What happened?
When compiling [UNET](https://storage.googleapis.com/shark_tank/latest/unet64_512_512_fp16_stabilityai_stable_diffusion_2_1_base/unet64_512_512_fp16_stabilityai_stable_diffusion_2_…
-
### What happened?
All the TensorCore tests under `iree/tests/e2e/matmul/` fail in my environment. The SIMT tests pass.
The environment is:
NVIDIA A10 GPU,
nvidia/cuda:11.6.2-cudnn8-devel…
-
# Future of Numerics and AI
.NET provides a broad range of support for various development domains, ranging from the creation of performance-oriented framework code to the rapid development of clou…