-
I am requesting that you merge with the upstream flash-attention repo in order to garner community engagement and improve integration and distribution.
This separation is a major blocker to AMD …
-
### Your current environment
```text
Collecting environment information...
PyTorch version: 2.1.2+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A
…
```
-
### 🐛 Describe the bug
An additional dimension appears in the second return value of a `torch.nn.LSTM` layer when `torch.compile` is applied to it (with any backend). The additional dimension…
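A minimal repro sketch of the kind of check being described (the module sizes and input shapes here are illustrative assumptions, not taken from the report). In eager mode, the second return value `(h_n, c_n)` has `h_n` of shape `(num_layers, batch, hidden_size)`; the issue claims the compiled module returns a differently shaped hidden state:

```python
import torch

# Small LSTM with illustrative sizes (hypothetical; any sizes reproduce the pattern).
lstm = torch.nn.LSTM(input_size=4, hidden_size=8, num_layers=1, batch_first=True)
x = torch.randn(2, 5, 4)  # (batch, seq, features)

# Eager mode: h_n has shape (num_layers, batch, hidden_size).
out, (h_n, c_n) = lstm(x)
print(h_n.shape)  # torch.Size([1, 2, 8])

# Compiled module (backend="eager" used here for a lightweight check;
# the report says the extra dimension appears with all backends).
compiled = torch.compile(lstm, backend="eager")
out_c, (h_c, c_c) = compiled(x)
print(h_c.shape)  # reported bug: an extra dimension appears here
```

Comparing `h_n.shape` against `h_c.shape` makes the discrepancy directly visible.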
-
### Your current environment
```text
Collecting environment information...
PyTorch version: 2.3.0+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A
…
```
-
### Your current environment
Issue with Pixtral Model: Unsupported Vision Configuration in vLLM (AMD Radeon 7900 XTX)
I am trying to load the Pixtral model from Hugging Face (specifically, mistr…
-
```text
Collecting environment information...
PyTorch version: 2.3.0+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A
OS: Ubuntu 20.04.6 LTS (x86_64)
GCC ve…
```
-
Just looking to confirm that the markdown files still have up-to-date instructions for us new folks. It looks like the last update was 6 months ago.
Mainly this file:
`mlx-examples/llms/CONTRIBUTI…
-
When executing the script `examples/offline_inference_with_prefix.py`, it calls `context_attention_fwd` from `vllm.model_executor.layers.triton_kernel.prefix_prefill`, which triggers the following er…
-
### Your current environment
The output of `python collect_env.py`
```text
PyTorch version: 2.4.0+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N…
```