-
### Checklist
- [X] 1. I have searched related issues but cannot get the expected help.
- [ ] 2. The bug has not been fixed in the latest version.
- [ ] 3. Please note that if the bug-related issue y…
-
While looking at example 55 (cutlass/examples/55_hopper_mixed_dtype_gemm/55_hopper_int4_bf16_gemm.cu), I was curious whether this modification would be legal:
From:
`using MmaType = cutlass::bfloat16…
-
### To reproduce
I'm getting this error after two hours or so when multiple servers are posting ILP data via the UDP port.
at io.questdb@8.1.1/io.questdb.cutlass.line.udp.LineUdpParserImpl.switchTa…
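To make the setup concrete, here is a minimal sketch of what one of the posting servers does: build an InfluxDB Line Protocol (ILP) row and fire it at QuestDB over UDP. The table name, tags, fields, and the default port 9009 are assumptions for illustration, not details taken from the report above.

```python
import socket

def make_ilp_line(table, tags, fields, ts_ns):
    # ILP row: table,tag1=v1 field1=v1 timestamp_ns
    tag_str = ",".join(f"{k}={v}" for k, v in tags.items())
    field_str = ",".join(f"{k}={v}" for k, v in fields.items())
    return f"{table},{tag_str} {field_str} {ts_ns}\n"

line = make_ilp_line("metrics", {"host": "srv1"}, {"load": 0.42},
                     1700000000000000000)

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
# Assumed default ILP/UDP port; sending is fire-and-forget, so parser
# errors only surface later in the server log, as in the trace above.
sock.sendto(line.encode("utf-8"), ("127.0.0.1", 9009))
sock.close()
```

Because UDP delivery is unacknowledged, a malformed or interleaved row from one of the senders only shows up server-side, in `LineUdpParserImpl`.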
-
Is `try_wait` on `barrier_Q` similar to `barrier_O`? Since Q has already set `expect_tx`, it seems like an additional wait might not be necessary. Am I understanding this correctly?
```
cutlass::C…
```
-
* The terminal process "/bin/bash '-c', '/usr/local/cuda-12.4/bin/nvcc -g -G -diag-suppress=177 -lineinfo --std=c++17 -arch=sm_75 '-D CUTE_ARCH_LDSM_SM75_ACTIVATED' -o flash_attention_cutlass_standa…
-
Hi all, I just built the latest Flash Attention 3 ([c1d146c](https://github.com/Dao-AILab/flash-attention/commit/c1d146cbd5becd9e33634b1310c2d27a49c7e862)) with the latest CUTLASS release (`v3.5.1`, i…
-
# ❓ Questions and Help
Hello, I am looking at the fused multi-head attention example in 3rdparty/cutlass.
In cutlass/examples, fused multi-head attention has been upstreamed to xformers.
And CUTLASS said fused multi h…
-
**Describe the bug**
I am trying to use CUTLASS Python and build it from source.
My environment is Ubuntu 18.04, CUDA 11.8, an NVIDIA Tesla V100 (Volta) GPU, Python 3.10, CMake 3.19, and GCC versio…
-
Thanks to recent updates, I was finally able to use CUTLASS to perform a basic 2D row convolution with strided input and output (see #1323).
However, I understand that 3.6 will push the `cutlass::c…
-
The current project repository **assumes** an existing **submodules** directory for all the optional dependencies. The Python installation script executes `checkout_submodules`, but that’s again only rele…
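A minimal sketch of the manual workaround when the submodules directory is missing: populate the optional dependencies explicitly before running the install script. The scratch-repository setup here is only to keep the example self-contained; in practice you would run the last command inside the actual checkout.

```shell
set -e
# Scratch repo so the sketch runs anywhere (in practice: cd into the real clone).
repo=$(mktemp -d)
cd "$repo"
git init -q .
# Fetches every registered optional dependency; a harmless no-op when
# the repository has no .gitmodules (as in this scratch repo).
git submodule update --init --recursive
echo "submodules ready"
```

Running `git submodule update --init --recursive` up front avoids relying on the install script's own `checkout_submodules` step.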