-
Generated Epilogue logging classes unpredictably throw runtime errors.
### Project setup
Robot.java
```java
@Logged
public final class Robot extends TimedRobot {
    private final ExampleSub…
```
BR88C updated
2 weeks ago
-
I'm now using CUTLASS in my project. I found that some cases place constraints on the layout, for example that input matrix A and output matrix C must be row-major. These kinds of assumptions limit the feasibi…
-
### System Info
- CPU: x86_64
- GPU: H100
- python: 3.10.14
- cuda: 12.2
- driver: 535.86.10
- torch: 2.4.0
### Who can help?
@ncomly-nvidia
### Information
- [x] The official example scripts
- …
-
**Describe the bug**
Some files are missing the headers that they rely on, which means they cannot be included by themselves. This is "hidden" in most of the examples because they import many things a…
-
**What is your question?**
I'm looking to define a GEMM that does the following (in pseudocode):
```
D = AB + C
F = norm(D, axis=1)
return F
```
That is, the epilogue should a) compute the colu…
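
For reference, the pseudocode above can be read as the plain-Python sketch below. It is only a host-side illustration of the intended result, not a CUTLASS epilogue. It assumes the Euclidean norm and NumPy's convention that `axis=1` reduces across the columns of each row (one value per row); the function name `gemm_bias_norm` is hypothetical.

```python
import math


def gemm_bias_norm(A, B, C):
    """Reference for D = A @ B + C followed by F = norm(D, axis=1).

    A is m x k, B is k x n, C is m x n (lists of lists of floats).
    Assumes an L2 norm taken over each row of D.
    """
    m, k, n = len(A), len(B), len(B[0])
    F = []
    for i in range(m):
        sq_sum = 0.0
        for j in range(n):
            # One element of D: dot product of row i of A with column j of B, plus C.
            d_ij = sum(A[i][p] * B[p][j] for p in range(k)) + C[i][j]
            sq_sum += d_ij * d_ij
        # Reduce the row of D to a single value.
        F.append(math.sqrt(sq_sum))
    return F


A = [[1.0, 2.0], [3.0, 4.0]]
B = [[1.0, 0.0], [0.0, 1.0]]  # identity, so A @ B == A
C = [[2.0, 2.0], [0.0, 0.0]]
F = gemm_bias_norm(A, B, C)   # D = [[3, 4], [3, 4]]
```

Note that D is never stored: each row is reduced as soon as its elements are computed, which is the property a fused epilogue would need to reproduce.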
-
**Steps/Code to reproduce bug**
```python
import torch
import cutlass.epilogue

def epilogue(accum, bias):
    D = accum + bias
    return D

examples_tensors = dict(
    accum=torch.randn(1024, 102…
```
-
Go does not currently (as of go1.23) emit `epilogue_begin` markers around where functions return. It does emit `prologue_end`.
The motivation here is that Go is not friendly to things like eBPF retur…
-
The generated epilogue for functions that return structs adds 4 to the stack pointer despite there being no corresponding subtraction in the prologue.
A minimal example for this issue would be
`…
-
Currently, we perform tail merging in the `fgAddInternal` phase, which runs right after inlining. There, we minimize the number of tails (BBJ_RETURN) to 4 (I guess it's driven by legacy jit32 gc e…
-
**What is your question?**
I tried example 48 and found that in the producer, the epilogue is not used at all. I am puzzled: what is the function of the producer epilogue?
Thanks