-
## 🚀 Model / language coverage
I'm trying to get a fuller picture of what we need to support NeVA. As such I'm using:
```python
def thunder_backend(gm, args):
    gm.real_recompile()
    from thu…
-
Hello team,
Please, I need help solving this issue; the test is failing:
python lm_inference_test.py --meliad_path=$MELIAD_PATH --data_path=$DATA
I0130 03:37:30.642391 139830854076224 nn_comp…
-
```python
NotImplementedError: No operator found for `memory_efficient_attention_forward` with inputs:
query : shape=(200, 9126, 1, 64) (torch.float32)
key : shape=(200, 912…
-
The softmax in tt-lib will fail if the tensor is in tile layout with a shape of 8,2,2 padded to 8,32,32, using a pad value of zero instead of -inf.
We have a workaround; at the moment we fall back…
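The pad value matters because softmax normalizes over every slot in the padded row: a pad of 0 still contributes exp(0) = 1 to the denominator and dilutes the real probabilities, whereas a pad of -inf contributes exp(-inf) = 0 and drops out. A minimal plain-Python sketch of the difference (illustrative only, not the tt-lib kernel):

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

# A "real" row of 2 values, padded out to length 4.
row = [1.0, 2.0]

# Zero padding: the pad slots contribute exp(0) = 1 each to the
# denominator, so the probabilities of the real entries shrink.
zero_padded = softmax(row + [0.0, 0.0])

# -inf padding: exp(-inf) = 0, so the pad slots vanish and the first
# two entries match the softmax of the unpadded row exactly.
inf_padded = softmax(row + [float("-inf"), float("-inf")])

print(softmax(row))     # reference result for the unpadded row
print(zero_padded[:2])  # diluted relative to the reference
print(inf_padded[:2])   # matches the reference
```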
-
### Is there an existing issue for this?
- [X] I have searched the existing issues
### Describe the bug
bfloat16 and float16 vectors do not support the pandas DataFrame data type, and the error is not user-fr…
-
### 🐛 Describe the bug
Repro: https://github.com/pytorch-labs/ao/pull/93
currently the CPU time for running weight-only int4 quantization seems to be slow on x86; it looks the same as the unlowered CPU m…
-
### Reference code
- Llama-recipes code
[https://github.com/meta-llama/llama-recipes/tree/b7fd81c71239c67345d897c0eb6529eba076e8b8](https://github.com/meta-llama/llama-recipes/tree/b7fd81c71239c…
-
Hi. I'm raising this issue as I am experiencing much slower inference times with Gemma-1 models.
> Environment:
> - xformers 0.0.26.post1 pypi_0 pypi
> - unsloth …
-
Hi, I am very happy to find a repo that can be used to fine-tune BLIP-2 quickly.
While using it (with LLaVA instruction data), I ran into some issues.
I only have a V100, but the model produces the following …
-
Hi,
thanks for the hard work!
Would it be possible to unload the model from VRAM after a certain amount of time?
For testing and under VRAM constraints, when using multiple services, that would be really helpful.
…
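For what it's worth, a generic idle-timeout pattern could drive such an unload: restart a countdown on every request, and fire an unload callback once it expires. A minimal sketch, assuming a service that calls `touch()` per request (`IdleUnloader` and the callback are hypothetical names; the callback would be whatever actually frees the model, e.g. moving it to CPU and emptying the CUDA cache):

```python
import threading

class IdleUnloader:
    """Invoke `unload` once no request has arrived for `timeout` seconds.

    Hypothetical helper: the real service would pass a callback that
    actually frees the model from VRAM.
    """

    def __init__(self, unload, timeout):
        self._unload = unload
        self._timeout = timeout
        self._timer = None
        self._lock = threading.Lock()

    def touch(self):
        """Call on every request to restart the idle countdown."""
        with self._lock:
            if self._timer is not None:
                self._timer.cancel()  # a request arrived; cancel the pending unload
            self._timer = threading.Timer(self._timeout, self._unload)
            self._timer.daemon = True
            self._timer.start()


# Usage sketch: mark a stand-in "model" unloaded after 0.2 s of inactivity.
state = {"loaded": True}
unloader = IdleUnloader(lambda: state.update(loaded=False), timeout=0.2)
unloader.touch()  # a request came in; the countdown restarts
```

The lock keeps `touch()` safe when requests arrive concurrently; each call cancels the previous timer so only the last countdown can fire.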