-
Hi, I've just created a small project ([link to the project](https://github.com/Yanksi/cute_mma)) by modifying the `sgemm_sm80` example. I was trying to make use of the tensor cores for…
-
**What is your question?**
The data in global memory is stored in int8 format.
I want to use TMA to load it directly from `gmem`, then cast the int8 data to fp16 before saving the fp16 data to `sme…
-
**What is your question?**
Is there a way to use TMA / CuTe to perform the following load pattern:
Starting with this 1-D tensor,
`[1, 2, 3, 4]`
I'd like to load the following 2-D tensor …
-
On Windows, this statement
```c++
std::cout
```
-
### Describe the bug
https://ooooooooo.ooo/static/?1f98c7b4-2676-45f2-8e2a-52458d0e22f8
The game gets stuck on the preloader and never moves past 1%. Tested on my S22 and Steam Deck.
[ooooooooo.ooo-1…
-
**Describe the bug**
Some files are missing the headers that they rely on, which means they cannot be included by themselves. This is "hidden" in most of the examples because they import many things a…
-
![image](https://github.com/71zenith/nix-dots/assets/44473782/69adc12d-4dc4-482b-9724-8212fad00c7e)
-
* The terminal process "/bin/bash '-c', '/usr/local/cuda-12.4/bin/nvcc -g -G -diag-suppress=177 -lineinfo --std=c++17 -arch=sm_75 '-D CUTE_ARCH_LDSM_SM75_ACTIVATED' -o flash_attention_cutlass_standa…
-
I need @mariechill and @alexjensen-NOAA to add an image of a cute pet to [my file](github-clinic/owen.md)
- [x] Marie's pic added
- [ ] Alex's pic added
-
### 🐛 Describe the bug
```
(conda_env) lcw@cr1-p548xlarge-19:~$ PYTORCH_NO_CUDA_MEMORY_CACHING=1 compute-sanitizer --print-limit=1 --num-callers-host=10 ipython
========= COMPUTE-SANITIZER
Pytho…