-
I built CUTLASS on an RTX 3070 with the following CMake commands:
```
cmake .. -DCUTLASS_NVCC_ARCHS=80 -DCUTLASS_LIBRARY_KERNELS=all
make cutlass_profiler -j16
```
When I run the following command:
```
./tools/prof…
```
-
### Request description
For **split-k matmul** `[(M x N x K), split-k-slices]`, this issue aims to collect information and resolve:
(a) functional verification
(b) performance profiling
We hav…
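
For reference, the functional-verification part (a) can be sketched in a few lines. This is a minimal pure-Python illustration of the split-k idea, not CUTLASS code: the K dimension is cut into `split_k_slices` chunks, each chunk yields a partial `M x N` product, and a final reduction sums the partials. The function names `reference_matmul` and `split_k_matmul` are hypothetical, and the ceil-divide partitioning of K is an assumption made for this sketch.

```python
def reference_matmul(a, b):
    """Plain M x K times K x N product, used as the golden result."""
    m, k, n = len(a), len(b), len(b[0])
    return [[sum(a[i][p] * b[p][j] for p in range(k)) for j in range(n)]
            for i in range(m)]


def split_k_matmul(a, b, split_k_slices):
    """Split-k product: one partial M x N result per K slice, then a reduction.

    Assumption for this sketch: K is partitioned by ceil-division, so the
    last slices may be shorter or empty when K is not divisible.
    """
    m, k, n = len(a), len(a[0]), len(b[0])
    chunk = (k + split_k_slices - 1) // split_k_slices

    partials = []
    for s in range(split_k_slices):
        k_begin = min(s * chunk, k)
        k_end = min(k_begin + chunk, k)
        # Each slice only touches its own range [k_begin, k_end) of K.
        partials.append(
            [[sum(a[i][p] * b[p][j] for p in range(k_begin, k_end))
              for j in range(n)]
             for i in range(m)]
        )

    # Reduction step: sum the partial products element-wise.
    return [[sum(part[i][j] for part in partials) for j in range(n)]
            for i in range(m)]
```

With integer inputs the split-k result should match the reference exactly for any slice count; with floating-point inputs the changed summation order can introduce small differences, so a tolerance-based comparison is the safer check.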
-
I am using Ubuntu 20.0 and ran into this...
So the command I use is:
./tricount -f /home/aa/Downloads/SIMT_TC-master/data
Any suggestions? Thanks!
-
Hi Gerardo,
How does the GPU miner differ from the CPU miner? Which parts of the code need to be changed or optimized to enable GPU mining?
Thank you.
Michael
-
OptiX and Radeon Rays have supported motion blur (putting transformation keyframes in the Acceleration Structures and time variables on rays) for the past 3-4 years.
The only way to achieve this in…
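
To make the keyframe idea concrete, here is a minimal sketch of how a per-ray time value can select an interpolated transform. It is not the OptiX or Radeon Rays API. It assumes translation-only keyframes spaced evenly over the time range [0, 1], and the name `translation_at` is hypothetical.

```python
def translation_at(keyframes, time):
    """Linearly interpolate a translation for a ray carrying `time`.

    Assumptions for this sketch: `keyframes` is a list of (x, y, z) tuples
    spaced evenly over [0, 1], and `time` is clamped to that range.
    """
    if len(keyframes) == 1:
        return keyframes[0]

    t = min(max(time, 0.0), 1.0)
    scaled = t * (len(keyframes) - 1)
    # Index of the keyframe at or before `t`; clamp so that i + 1 stays valid.
    i = min(int(scaled), len(keyframes) - 2)
    frac = scaled - i

    start, end = keyframes[i], keyframes[i + 1]
    return tuple(s + (e - s) * frac for s, e in zip(start, end))
```

A traversal would evaluate something like this once per ray and per instance, so each ray sees the geometry at its own point in time within the shutter interval.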
-
Hi all!
Slightly weird issue, but I'm trying to gain access to the RT cores available on Arc Alchemist, as well as on any future RT-enabled devices.
As of now the only "public" way of accessing R…
-
**What is your question?**
I compiled the code (examples/06_splitK_gemm) on an A100:
```c++
/***************************************************************************************************
* Copyri…
```
-
Hi,
I've tested `depthwise_conv2d + bias` by slightly modifying **46_depthwise_simt_conv2dfprop**, and the output is correct when compared with the PyTorch golden result. But for the case `depthwise_conv2d +…
-
# My setup
- Arch Linux
- AMDGPU PRO drivers
- AMD RX 7900 XTX
# Command
`VK_ICD_FILENAMES=/usr/share/vulkan/icd.d/amd_pro_icd64.json python apps/stable_diffusion/scripts/txt2img.py --preci…
-
### Summary of Problem
This issue came up while trying to fix `NotGPU.chpl` by replacing `const` with `param` in #22113.
If a function is called in a GPU locale and this function changes the val…