-
### Feature request
Hi, I'm the author of [zhuzilin/ring-flash-attention](https://github.com/zhuzilin/ring-flash-attention).
I wonder if you are interested in integrating context parallel with [zh…
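For readers unfamiliar with the idea: ring attention shards the sequence across ranks, and K/V blocks rotate around a ring while each rank merges partial results with an online softmax. The following is a minimal single-process sketch of that merge rule (the function names and the non-causal, single-head setup are my own simplification, not the ring-flash-attention API):

```python
import numpy as np

def full_attention(q, k, v):
    # Reference O(N^2) softmax attention.
    s = q @ k.T / np.sqrt(q.shape[-1])
    p = np.exp(s - s.max(axis=-1, keepdims=True))
    p /= p.sum(axis=-1, keepdims=True)
    return p @ v

def ring_attention(q, k, v, num_ranks):
    # Single-process simulation of (non-causal) ring attention.
    # Each "rank" owns one query block; K/V blocks rotate around the
    # ring, and partial outputs are merged with an online softmax.
    d = q.shape[-1]
    q_blocks = np.split(q, num_ranks)
    k_blocks = np.split(k, num_ranks)
    v_blocks = np.split(v, num_ranks)
    out = []
    for r in range(num_ranks):
        qi = q_blocks[r]
        acc = np.zeros_like(qi)              # running (rescaled) numerator
        lse = np.full(qi.shape[0], -np.inf)  # running log-sum-exp
        for step in range(num_ranks):
            j = (r + step) % num_ranks       # K/V block received this step
            s = qi @ k_blocks[j].T / np.sqrt(d)
            blk_max = s.max(axis=-1)
            blk_lse = blk_max + np.log(np.exp(s - blk_max[:, None]).sum(axis=-1))
            new_lse = np.logaddexp(lse, blk_lse)
            # Rescale the old accumulator, then add this block's contribution.
            acc = acc * np.exp(lse - new_lse)[:, None] \
                + np.exp(s - new_lse[:, None]) @ v_blocks[j]
            lse = new_lse
        out.append(acc)
    return np.concatenate(out)
```

Because the log-sum-exp merge is exact, the blocked result matches full attention up to floating-point error, which is what makes the communication pattern attractive for long contexts.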
-
### 🐛 Describe the bug
I struggled a bit to get a repro, but I think this one is reasonably minimal and isolates the behavior that causes my runs to diverge.
```python
import torch
impor…
-
GPU: NVIDIA H20
Audio length: 1 min 30 s
Audio format: wav
Image format: png
Image size: 208K, 525x526, 25 fps, 25 tbr, 25 tbn
Branch: main
Config:
## configs/prompts/animation_acc.yaml
## dependency models
pretrained_base_mod…
-
### System Info
- `transformers` version: 4.44.2
- Platform: macOS-15.1-arm64-arm-64bit
- Python version: 3.10.14
- Huggingface_hub version: 0.23.3
- Safetensors version: 0.4.3
- Accelerate vers…
-
### System Info
- `transformers` version: 4.42.4
### Who can help?
@Gante
### Information
- [ ] The official example scripts
- [X] My own modified scripts
### Tasks
- [ ] An officially supported tas…
-
### Reminder
- [X] I have read the README and searched the existing issues.
### System Info
- `llamafactory` version: 0.9.1.dev0
- Platform: Linux-5.4.119-1-tlinux4-0010.3-x86_64-with-glibc2.38
-…
-
Here is the development roadmap for 2024 Q4. Contributions and feedback are welcome ([**Join Bi-weekly Development Meeting**](https://t.co/4BFjCLnVHq)). Previous 2024 Q3 roadmap can be found in #634.
…
-
When opening the URL (http://0.0.0.0:7860) I get the "can't reach this page" message. I don't get any errors while loading, apart from the "No module named 'triton'" one, which I assume is normal on …
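Not a maintainer answer, but one common cause worth ruling out: `0.0.0.0` is the wildcard *bind* address the server listens on, not an address to browse to; try `http://127.0.0.1:7860` instead. A small sketch (the helper name `port_is_listening` is my own) to confirm the server is actually accepting connections on loopback:

```python
import socket

def port_is_listening(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP server accepts connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# The server binds 0.0.0.0 (all interfaces), but you reach it through a
# concrete address, e.g.:
#   port_is_listening("127.0.0.1", 7860)
```

If this returns False, the server never actually started listening, and the "can't reach this page" message is a symptom of a startup failure rather than a browser problem.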
-
**Describe the bug**
When attempting to shard a `gemma_2b_en` model across two (consumer-grade) GPUs, I get:
```
ValueError: One of device_put args was given the sharding of NamedSharding(mesh=…
-
### 🚀 The feature, motivation and pitch
Linear attention allows for longer context and faster inference. The Eagle model will have a 2T checkpoint soon.
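To make the complexity argument concrete, here is a minimal sketch of kernelized linear attention in the style of "Transformers are RNNs" (ELU+1 feature map); it is an illustration of the general technique, not a claim about the Eagle model's exact formulation:

```python
import numpy as np

def phi(x):
    # Positive feature map: ELU(x) + 1.
    return np.where(x > 0, x + 1.0, np.exp(x))

def linear_attention(q, k, v):
    # O(N * d * d_v) instead of O(N^2 * d): associativity lets us
    # compute phi(K)^T V once and reuse it for every query.
    qf, kf = phi(q), phi(k)
    kv = kf.T @ v                  # (d, d_v) summary of all keys/values
    z = kf.sum(axis=0)             # per-feature normalizer
    return (qf @ kv) / (qf @ z)[:, None]

def quadratic_reference(q, k, v):
    # The same kernelized attention computed the O(N^2) way.
    a = phi(q) @ phi(k).T
    return (a / a.sum(axis=-1, keepdims=True)) @ v
```

The two functions compute identical outputs; only the associativity of the matrix products changes, which is what makes the linear variant attractive for long contexts.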
### Alternatives
NA
### Additional context
_No r…