-
The following TTGIR currently fails with CUDA_ERROR_ILLEGAL_ADDRESS.
- The configuration used for this test case is `{"block_m":16,"block_n":16,"block_k":16,"split_k":1,"num_stages":2,"num_warps":2…
-
### What happened?
Compilation to llvm-cpu fails with error: One or more operations with large vector sizes (8192 bytes) were found
Input IR:
```
#map = affine_map (d0, d2, d3)>
#map1 = affine_map…
-
When I call `null_inversion.invert()`, the following error occurs:
```
Traceback (most recent call last): …
-
**REPRODUCER**
```mlir
func.func @vectorization_test(%extracted_slice : tensor, %arg0: index, %arg2: index, %3: tensor, %4: tensor) -> tensor{
%c0 = arith.constant 0 : index
%8 = linalg.generic {…
-
When I use the code to run inference on 512x320 images + video, I get this error:
```
File "/workspace/ToonCrafter_with_SketchGuidance/cldm/cldm.py", line 339, in forward
h += gui…
-
Using this issue to note down internal discussion about conv performance.
### Problem
1x1 filter convolutions get converted to `linalg.generic` ops during the global optimization pipeline. For exam…
-
### 🐛 Describe the bug
Today at https://github.com/pytorch/pytorch/blob/51e0996d58e6fa40a8d255a26b767c3f3e035943/torch/_dynamo/variables/tensor.py#L589C1-L596C10
we fall through and put an arbitrar…
-
We observed good overlap with FSDP + PGLE:
![Bq7PCuqyJbygSuL](https://github.com/user-attachments/assets/0cff27c4-6499-43d0-b436-ef01a2833ae0). Turning PGLE on and off makes a big difference here.
…
-
The most recent llvm integrate, https://github.com/iree-org/iree/pull/18987, introduced a minor regression in SDXL clip dispatch count (1139 ⇾ 1141). I tracked it to https://github.com/llvm/llvm-proje…
-
It seems that ttb.tensor only supports creating a tensor from a numpy array. Is there any reason it cannot create a tensor directly from a Python array?
Right now, the user has to type the followi…