-
Is there any way to add Flash Attention 2 support for this model? If there is a way to do it, I would love to get involved and help out!
I've tried implementing it by looking at [MusicGen's one](https://git…
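For context, this is roughly how Flash Attention 2 is switched on for transformers models that already support it (MusicGen among them); a minimal sketch, where the checkpoint id is a placeholder and not this project's model:

```python
import torch
from transformers import AutoModel

# Minimal sketch: FA2 is opted into via attn_implementation and requires
# the weights to be loaded in fp16 or bf16.
model = AutoModel.from_pretrained(
    "some/checkpoint",                       # placeholder checkpoint id
    torch_dtype=torch.float16,
    attn_implementation="flash_attention_2",
)
```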
-
I'm trying to load LoRAs with
`pipeline.pipe.load_lora_weights("/kaggle/input/lorass/acuarelac1400.safetensors")`
I don't know if this is the correct way; it would be helpful if you told me how to load lor…
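For what it's worth, the documented diffusers pattern passes the directory plus a `weight_name` rather than one absolute file path; a minimal sketch, assuming a Stable Diffusion pipeline (the base checkpoint here is only an example):

```python
import torch
from diffusers import StableDiffusionPipeline

# Sketch only: load a base pipeline, then attach the LoRA weights.
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# Directory + weight_name instead of a single absolute .safetensors path.
pipe.load_lora_weights("/kaggle/input/lorass", weight_name="acuarelac1400.safetensors")

image = pipe("a watercolor landscape").images[0]
```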
-
**Describe the bug**
When I use flash-attn==2.0.4, running NeMo results in the error `NameError: name 'flash_attn_with_kvcache' is not defined`.
After checking the [code](https://github.com/NVIDIA…
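A quick way to confirm whether the installed wheel is the problem: `flash_attn_with_kvcache` only exists in newer flash-attn releases, so a 2.0.x build leaves the name undefined. This is just a diagnostic sketch, not NeMo code:

```python
import flash_attn

print(flash_attn.__version__)

try:
    # Present in newer flash-attn releases; missing from 2.0.x wheels.
    from flash_attn import flash_attn_with_kvcache  # noqa: F401
    print("flash_attn_with_kvcache is available")
except ImportError:
    print("this flash-attn build is too old for the kv-cache kernel; upgrade flash-attn")
```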
-
When I read the code in your nice_stand.py file, I didn't see you using self-attention or graph attention mechanisms, even though you describe this part in your paper.
![Image 1](https://github.com/eeyhsong/NICE-…
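To make the question concrete, this is the kind of self-attention block I expected to find; a minimal PyTorch sketch, not code from nice_stand.py:

```python
import torch
import torch.nn as nn

class SelfAttentionBlock(nn.Module):
    """Plain multi-head self-attention with a residual connection."""

    def __init__(self, dim: int, num_heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, tokens, dim); query, key and value all come from x.
        attn_out, _ = self.attn(x, x, x)
        return self.norm(x + attn_out)
```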
-
We should redesign the navbar_alerts banners (`web/templates/navbar_alerts`).
Designs [in Figma](https://www.figma.com/design/msWyAJ8cnMHgOMPxi7BUvA/Zulip-Web-UI-kit?node-id=563-2713&t=ZDGbub…
-
This is my simple test script:
```python
import torch
from parler_tts import ParlerTTSForConditionalGeneration
from transformers import AutoTokenizer
import soundfile as sf
torch_device = "m…
-
The 2 AMD GPU cards should be at NERC now; attention @hakasapl.
Please arrange for them to be installed (techsquare?) and made available under ESI.
The price to charge will be addressed in https://github.com…
-
# 🚀 Feature
Support Flash Attention 3
## Motivation
Flash Attention 3 has been shown to be significantly faster than Flash Attention 2 on H100 GPUs.
## Pitch
Offer Flash Attention 3 support
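One way the support could be gated is sketched below; this is a rough illustration only, and the `flash_attn_interface` module name used by the FA3 (Hopper) beta wheels is an assumption here:

```python
# Prefer the FA3 kernel when its package is installed, otherwise fall back to FA2.
try:
    from flash_attn_interface import flash_attn_func as _flash_attn_func  # FA3 beta (assumed module name)
    HAS_FLASH_ATTN_3 = True
except ImportError:
    from flash_attn import flash_attn_func as _flash_attn_func  # FA2
    HAS_FLASH_ATTN_3 = False
```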
-
Are there plans to add Flash Attention and also Flash Decoding to improve performance for long contexts?
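In the meantime, here is a minimal sketch of the kind of kernel dispatch being asked about, assuming a PyTorch-based stack (this is not the project's own code): PyTorch's SDPA can already route to a FlashAttention kernel on supported GPUs.

```python
import torch
import torch.nn.functional as F
from torch.nn.attention import SDPBackend, sdpa_kernel

# Long-context attention inputs: (batch, heads, seq_len, head_dim) in fp16.
q = torch.randn(1, 8, 8192, 64, device="cuda", dtype=torch.float16)
k = torch.randn(1, 8, 8192, 64, device="cuda", dtype=torch.float16)
v = torch.randn(1, 8, 8192, 64, device="cuda", dtype=torch.float16)

# Force the FlashAttention backend so the memory-efficient kernel is used.
with sdpa_kernel(SDPBackend.FLASH_ATTENTION):
    out = F.scaled_dot_product_attention(q, k, v, is_causal=True)
```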
-
### 🚀 The feature, motivation and pitch
1. `NotImplementedError: Could not run 'aten::_to_copy' with arguments from the 'NestedTensorXPU' backend`
cases:
test_transformers.py::TestTransformersXPU::te…
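A minimal repro sketch of the failing pattern, assuming a PyTorch build with XPU support (the `"xpu"` device string is the assumption here):

```python
import torch

# Build a nested tensor on CPU, then move it to the XPU device.
nt = torch.nested.nested_tensor([torch.randn(2, 8), torch.randn(3, 8)])

# .to() lowers to aten::_to_copy, which is reported as unimplemented for the
# NestedTensorXPU backend, so this call raises NotImplementedError.
nt_xpu = nt.to(device="xpu", dtype=torch.float16)
```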