-
### Checklist
- [X] 1. I have searched related issues but cannot get the expected help.
- [ ] 2. The bug has not been fixed in the latest version.
- [ ] 3. Please note that if the bug-related issue y…
-
While looking at example 55 (cutlass/examples/55_hopper_mixed_dtype_gemm/55_hopper_int4_bf16_gemm.cu), I was curious whether this modification would be legal:
From:
`using MmaType = cutlass::bfloat16…
-
### To reproduce
I'm getting this error after two hours or so when multiple servers are posting ILP data via the UDP port.
at io.questdb@8.1.1/io.questdb.cutlass.line.udp.LineUdpParserImpl.switchTa…
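To make the setup concrete, here is a minimal sketch of what one of the posting servers does: build an InfluxDB Line Protocol (ILP) row and fire it at QuestDB over UDP. The table name, tags, fields, and the default port 9009 are assumptions for illustration, not details taken from the report above.

```python
import socket

def make_ilp_line(table, tags, fields, ts_ns):
    # ILP row: table,tag1=v1 field1=v1 timestamp_ns
    tag_str = ",".join(f"{k}={v}" for k, v in tags.items())
    field_str = ",".join(f"{k}={v}" for k, v in fields.items())
    return f"{table},{tag_str} {field_str} {ts_ns}\n"

line = make_ilp_line("metrics", {"host": "srv1"}, {"load": 0.42},
                     1700000000000000000)

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
# Assumed default ILP/UDP port; sending is fire-and-forget, so parser
# errors only surface later in the server log, as in the trace above.
sock.sendto(line.encode("utf-8"), ("127.0.0.1", 9009))
sock.close()
```

Because UDP delivery is unacknowledged, a malformed or interleaved row from one of the senders only shows up server-side, in `LineUdpParserImpl`.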
-
Is `try_wait` on `barrier_Q` similar to `barrier_O`? Since Q has already set `expect_tx`, it seems like an additional wait might not be necessary. Am I understanding this correctly?
```
cutlass::C…
```
-
* The terminal process "/bin/bash '-c', '/usr/local/cuda-12.4/bin/nvcc -g -G -diag-suppress=177 -lineinfo --std=c++17 -arch=sm_75 '-D CUTE_ARCH_LDSM_SM75_ACTIVATED' -o flash_attention_cutlass_standa…
-
Hi all, I just built the latest Flash Attention 3 ([c1d146c](https://github.com/Dao-AILab/flash-attention/commit/c1d146cbd5becd9e33634b1310c2d27a49c7e862)) with the latest CUTLASS release (`v3.5.1`, i…
-
# ❓ Questions and Help
Hello, I am looking at the fused multi-head attention example in 3rdparty/cutlass.
In cutlass/examples, fused multi-head attention has been upstreamed to xformers.
And CUTLASS said fused multi h…
-
**Describe the bug**
I am trying to use CUTLASS Python and build it from source.
My environment is Ubuntu 18.04, CUDA 11.8, an NVIDIA Tesla V100 (Volta) GPU, Python 3.10, CMake 3.19, and GCC versio…
-
Thanks to recent updates, I was finally able to use CUTLASS to perform a basic 2D row convolution with strided input and output (see #1323).
However, I understand that 3.6 will push the `cutlass::c…
-
The current project repository **assumes** an existing **submodules** directory for all the optional dependencies. The Python installation script executes `checkout_submodules`, but that’s again only rele…
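A minimal sketch of the manual workaround when the submodules directory is missing: populate the optional dependencies explicitly before running the install script. The scratch-repository setup here is only to keep the example self-contained; in practice you would run the last command inside the actual checkout.

```shell
set -e
# Scratch repo so the sketch runs anywhere (in practice: cd into the real clone).
repo=$(mktemp -d)
cd "$repo"
git init -q .
# Fetches every registered optional dependency; a harmless no-op when
# the repository has no .gitmodules (as in this scratch repo).
git submodule update --init --recursive
echo "submodules ready"
```

Running `git submodule update --init --recursive` up front avoids relying on the install script's own `checkout_submodules` step.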