-
### RuntimeError: The expanded size of the tensor (628) must match the existing size (129) at non-singleton dimension 3. Target sizes: [1, 32, 1, 628]. Tensor sizes: [1, 1, 1, 129]
Issue: Upon ru…
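The shape mismatch in the error follows the usual broadcasting rule: dimensions are compared right-to-left, and each pair must be equal or contain a 1. A self-contained sketch of that rule (plain Python, not tied to the original model code) applied to the shapes from the message:

```python
def broadcastable(a, b):
    """Return True if shapes a and b can be broadcast together:
    dimensions are compared right-to-left, and each pair must be
    equal or contain a 1."""
    for x, y in zip(reversed(a), reversed(b)):
        if x != y and x != 1 and y != 1:
            return False
    return True

# Shapes from the error message: dim 3 holds 628 vs 129 -> incompatible.
print(broadcastable((1, 32, 1, 628), (1, 1, 1, 129)))  # False
# A singleton in dim 3 would broadcast fine:
print(broadcastable((1, 32, 1, 628), (1, 1, 1, 1)))    # True
```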
-
Tried to extract a LoRA from gemma-2-it:
```
mergekit-extract-lora models/gemma-2-2b-it/ models/gemma-2-2b models/lora/gemma-2-it-lora
The installed version of bitsandbytes was compiled without GPU s…
-
Hi,
I am trying to set up vLLM Mixtral 8x7b on GCP. I have a VM with two A100 80GB GPUs and am using the following setup:
docker image: vllm/vllm-openai:v0.3.0
Model: mistralai/Mixtral-8x7B-Instruct…
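With that image and model, a two-GPU launch can be sketched as follows (the cache mount path and port are illustrative; `--model` and `--tensor-parallel-size` are standard vLLM server arguments):

```shell
docker run --gpus all \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  -p 8000:8000 \
  vllm/vllm-openai:v0.3.0 \
  --model mistralai/Mixtral-8x7B-Instruct-v0.1 \
  --tensor-parallel-size 2
```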
-
### 🐛 Describe the bug
Is symbolic execution expected to fail for kernels that have fallthrough registrations?
Reproducer with a fallthrough for the mul operator:
```python
import torch
impo…
-
#### Summary:
From @betanalpha
C_{ij} = \sum_{m,n=1}^{N} A_{imn} B_{jmn}
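That contraction (summing over both trailing indices of each matrix slice) can be checked with a short pure-Python sketch; the function name mirrors the requested Stan signature, and the data is illustrative:

```python
def tensor_product(A, B):
    """C[i][j] = sum over m, n of A[i][m][n] * B[j][m][n],
    where A and B are lists of equally sized matrices."""
    return [[sum(A[i][m][n] * B[j][m][n]
                 for m in range(len(A[i]))
                 for n in range(len(A[i][m])))
             for j in range(len(B))]
            for i in range(len(A))]

A = [[[1, 2], [3, 4]]]                     # one 2x2 matrix
B = [[[1, 0], [0, 1]], [[2, 2], [2, 2]]]   # two 2x2 matrices
print(tensor_product(A, B))                # [[5, 20]]
```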
#### Description:
In Stan code:
```
matrix tensor_product(matrix[] A, matrix[] B) {
matrix[size(A), size(B)…
-
**Describe the bug**
The `log` function returns an invalid value.
**To Reproduce**
Steps to reproduce the behavior:
1. Copy and paste the code below …
-
For some reason, `torch.argsort` is crashing when the tensor is on a Neuron device. For example, the code snippet below works as expected (on CPU):
```
>>> import torch
>>> import torch_neuronx
>…
-
Naively, observable sharing means that we either need to put sharing in the hands of users (by giving them the choice to share or not), or give them a verbose API that doesn't really resemble maths. W…
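The problem can be seen in any deep-embedded DSL: sharing at the host-language level is invisible to the expression tree, so a shared subterm gets duplicated. A toy sketch (illustrative class and names, not any particular library):

```python
# A tiny deep-embedded expression DSL. Binding `x` shares the node
# at the Python level, but the printed AST duplicates the subterm,
# which is exactly what observable sharing would recover.
class Expr:
    def __init__(self, op, *args):
        self.op, self.args = op, args

    def __add__(self, other):
        return Expr("+", self, other)

    def __mul__(self, other):
        return Expr("*", self, other)

    def __repr__(self):
        if not self.args:
            return self.op
        return f"({self.args[0]!r} {self.op} {self.args[1]!r})"

a, b = Expr("a"), Expr("b")
x = a + b            # shared in the host language...
print(repr(x * x))   # ...duplicated in the tree: ((a + b) * (a + b))
```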
-
### 🐛 Describe the bug
If I `torch.compile` `torch.amp.GradScaler`, it works. But if I copy `grad_scaler.py` and import `GradScaler` from there, I receive an error.
To reproduce (testcase taken …
-
### 🐛 Describe the bug
I've found that the output of the wav2vec2 pipeline model is bugged: it changes depending on the zero-padding used in batch preprocessing. A simple example is as follows:
…
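This padding sensitivity is what you would expect if normalization statistics are computed over the full (padded) time axis without masking. A toy sketch of that effect, not wav2vec2's actual code:

```python
def normalize(frames):
    """Normalize over the time axis; mean and variance include every
    frame, so appended zero-padding shifts the statistics."""
    mean = sum(frames) / len(frames)
    var = sum((f - mean) ** 2 for f in frames) / len(frames)
    return [(f - mean) / (var ** 0.5 + 1e-5) for f in frames]

x = [1.0, 2.0, 3.0]
print(normalize(x))                       # stats over 3 real frames
print(normalize(x + [0.0, 0.0])[:3])      # same frames, different output
```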