-
It is essential to stay up to date with OpenAI Triton, both to get the latest features and to reduce the difficulty of upstreaming our changes to OpenAI Triton.
This ticket is a continuation of:
- #1590
- #1…
-
### Motivation
Recently, the OpenAI Triton backend for AMD hardware [PR 3643](https://github.com/vllm-project/vllm/pull/3643) was merged, which is so far the only flash attention backend with the…
-
Hi! I see that `openai/triton` requires a working toolchain at run-time, including CUDA Toolkit and libpython installations for the host platform. Currently, triton attempts to guess the correct comp…
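As context for the point above, here is a minimal plain-Python sketch of the *kind* of run-time toolchain discovery being described: honor an explicit environment override, otherwise search `PATH` for common host compilers. The function name, the `CC` override, and the candidate list are illustrative assumptions, not Triton's actual logic.

```python
import os
import shutil

def find_host_compiler(env_var="CC", candidates=("cc", "gcc", "clang")):
    """Illustrative sketch of run-time compiler discovery (not Triton's code):
    prefer an explicit environment override, else search PATH for candidates."""
    override = os.environ.get(env_var)
    if override:
        return override
    for name in candidates:
        path = shutil.which(name)  # returns the full path, or None if absent
        if path:
            return path
    return None  # no toolchain found; a real implementation would raise here
```

An explicit override (e.g. `CC=/opt/bin/mycc`) always wins, which is usually the escape hatch when automatic guessing picks the wrong compiler.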
-
### Bug Description
First I used `llama-index 0.9.13` and ran `pip install llama-index-llms-nvidia-triton` (this installs version 0.0.1, along with llama-index-core==0.9.56).
But I cannot impo…
-
I have noticed that the README lists Linux as the only compatible platform: https://github.com/openai/triton#compatibility
In the past, some people have managed to compile it on Windows: https://github…
-
In ops/flash_attention.py, the K and V blocks are accessed through `make_block_ptr`. I have a question:
The input tensors `q, k, v` are of size (Batch, n_head, seq_num, dim_per_head), but…
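As background for the question above (not the author's code), here is a minimal plain-Python sketch of how a block over a `(Batch, n_head, seq_num, dim_per_head)` tensor is typically addressed: `make_block_ptr` receives a base offset computed from the batch and head strides, and the block then spans a `(BLOCK_N, dim_per_head)` tile of the sequence. The helper names and block sizes are illustrative assumptions.

```python
# Illustrative sketch (plain Python, no Triton): stride arithmetic for a
# contiguous tensor of shape (Batch, n_head, seq_num, dim_per_head),
# mirroring the addressing that make_block_ptr performs.

def contiguous_strides(shape):
    """Row-major strides in elements, e.g. (H*S*D, S*D, D, 1) for (B, H, S, D)."""
    strides = [1] * len(shape)
    for i in range(len(shape) - 2, -1, -1):
        strides[i] = strides[i + 1] * shape[i + 1]
    return tuple(strides)

def k_block_offsets(shape, b, h, seq_start, BLOCK_N):
    """Flat element offsets of a (BLOCK_N, dim_per_head) K tile starting at
    sequence position seq_start, for batch b and head h (hypothetical helper)."""
    B, H, S, D = shape
    sb, sh, ss, sd = contiguous_strides(shape)
    base = b * sb + h * sh  # per-(batch, head) base, as passed to make_block_ptr
    return [[base + (seq_start + n) * ss + d * sd for d in range(D)]
            for n in range(BLOCK_N)]

# Example: shape (2, 4, 16, 8); rows of the tile are ss elements apart,
# columns are sd (= 1) apart.
offs = k_block_offsets((2, 4, 16, 8), b=1, h=2, seq_start=4, BLOCK_N=2)
```

The point of the sketch: once the kernel has fixed `(b, h)`, the remaining indexing is a 2-D tile over `(seq_num, dim_per_head)`, which is exactly the shape the block pointer describes.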
-
### Required prerequisites
- [X] Make sure you've read the [documentation](https://pybind11.readthedocs.io). Your issue may be addressed there.
- [X] Search the [issue tracker](https://github.com/pyb…
-
The latest Triton refactoring removed the Intel Triton backend from third-party; the `llvm-target` branch is a fork of `openai/Triton` with in-tree modifications.
To upstream Intel XPU Triton backend…
-
It seems that `triton` has recently switched to `mathlib` in lieu of `libdevice`, which causes the following errors in `nn.triton_based_modules`:
```bash
AttributeError: module 'triton.language' has no a…
-
The handwritten CUDA operator is very complicated. How can we use OpenAI Triton in candle to simplify this process? :)