-
### Describe the issue
`CMakeLists.txt` checks that the GCC version is at least 9: https://github.com/microsoft/onnxruntime/blob/c4fb724e810bb496165b9015c77f402727392933/cmake/CMakeL…
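
For readers unfamiliar with this kind of gate, here is a minimal Python sketch of a minimum-version check. It is not the project's CMake logic. The names `MIN_GCC`, `parse_version`, and `gcc_is_supported` are hypothetical, and the only fact taken from the report is the minimum of version 9.

```python
import re

# Assumed minimum, taken from the "at least 9" statement in the report.
MIN_GCC = (9, 0, 0)


def parse_version(text):
    """Return (major, minor, patch) from the first dotted number in text.

    Missing minor or patch components are treated as 0.
    """
    match = re.search(r"(\d+)(?:\.(\d+))?(?:\.(\d+))?", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part or 0) for part in match.groups())


def gcc_is_supported(version_text, minimum=MIN_GCC):
    """Compare the parsed version against the minimum, component by component."""
    return parse_version(version_text) >= minimum
```

Padding every version to three components keeps the tuple comparison well defined, so a bare `9` compares equal to `9.0.0` instead of sorting below it.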
-
### 🐛 Describe the bug
Dear all,
We seem to have found a bug in the `nn.Linear` forward pass. Here is a minimal example:
```python
# import
import torch
import time
# Set input size, output size, an…
-
### System Info
Tested with PyTorch 2.2 and 2.4.
transformers 4.46.2
4 * A6000 ada
### Who can help?
@muellerzr
### Information
- [X] The official example scripts
- [ ] My own modified script…
-
**Describe the bug**
Can't create a FLOAT32 (or INT32) tensor with a shape that requires tiling.
```
libc++abi: terminating due to uncaught exception of type std::runtime_error: TT_FATAL @ /ho…
-
## Context
Currently, `ZeroPointDomain` is bound to the layout:
https://github.com/pytorch/ao/blob/2ba1a61fe1244560325b5051b5d3c10044553be0/torchao/quantization/quant_api.py#L607-L615
Ideally, we shoul…
-
both [fused-attention](https://triton-lang.org/main/getting-started/tutorials/06-fused-attention.html#sphx-glr-getting-started-tutorials-06-fused-attention-py) and [flash-attn-og](https://github.com/D…
-
Here is the warning:
```
AUTOTUNE flex_decoding(1x128x1x32x128, 1x128x8x128, 1x128x8x128, 1x1x128x32, 1x1x128x32, 1x1x1, 1x1x1x1, 1x1x1, 1x1x1x1)
triton_flex_decoding_1 0.0172 ms 100.0% BLOCKS_AR…
-
### System Info
2X L4 GPUs
Docker Image:
nvcr.io/nvidia/tritonserver:24.06-trtllm-python-py3
### Who can help?
@juney-nvidia @kaiyux
### Information
- [ ] The official example sc…
-
### Is there an existing issue for this?
- [X] I have searched the existing issues
### Environment
```markdown
- Milvus version: master-20241112-b5b00355-amd64
- Deployment mode(standalone …
-
**Describe the bug**
The code runs and produces correct output in Release mode but fails in Debug mode with the following error.
Error here. https://github.com/tenstorrent/tt-metal/blob/8901511f737c…