-
### Feature request
Hi, I'm the author of [zhuzilin/ring-flash-attention](https://github.com/zhuzilin/ring-flash-attention).
I wonder if you are interested in integrating context parallel with [zh…
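For readers unfamiliar with the idea: ring attention shards the sequence across ranks, and K/V blocks rotate around a ring while each rank merges partial results with an online softmax. The following is a minimal single-process sketch of that merge rule (the function names and the non-causal, single-head setup are my own simplification, not the ring-flash-attention API):

```python
import numpy as np

def full_attention(q, k, v):
    # Reference O(N^2) softmax attention.
    s = q @ k.T / np.sqrt(q.shape[-1])
    p = np.exp(s - s.max(axis=-1, keepdims=True))
    p /= p.sum(axis=-1, keepdims=True)
    return p @ v

def ring_attention(q, k, v, num_ranks):
    # Single-process simulation of (non-causal) ring attention.
    # Each "rank" owns one query block; K/V blocks rotate around the
    # ring, and partial outputs are merged with an online softmax.
    d = q.shape[-1]
    q_blocks = np.split(q, num_ranks)
    k_blocks = np.split(k, num_ranks)
    v_blocks = np.split(v, num_ranks)
    out = []
    for r in range(num_ranks):
        qi = q_blocks[r]
        acc = np.zeros_like(qi)              # running (rescaled) numerator
        lse = np.full(qi.shape[0], -np.inf)  # running log-sum-exp
        for step in range(num_ranks):
            j = (r + step) % num_ranks       # K/V block received this step
            s = qi @ k_blocks[j].T / np.sqrt(d)
            blk_max = s.max(axis=-1)
            blk_lse = blk_max + np.log(np.exp(s - blk_max[:, None]).sum(axis=-1))
            new_lse = np.logaddexp(lse, blk_lse)
            # Rescale the old accumulator, then add this block's contribution.
            acc = acc * np.exp(lse - new_lse)[:, None] \
                + np.exp(s - new_lse[:, None]) @ v_blocks[j]
            lse = new_lse
        out.append(acc)
    return np.concatenate(out)
```

Because the log-sum-exp merge is exact, the blocked result matches full attention up to floating-point error, which is what makes the communication pattern attractive for long contexts.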
-
### 🐛 Describe the bug
I struggled a bit to get a repro, but I think this one is reasonably minimal and isolates the behavior that causes my runs to diverge.
```python
import torch
impor…
-
GPU: NVIDIA H20
Audio length: 1 min 30 s
Audio format: wav
Image format: png
Image size: 208K, 525x526, 25 fps, 25 tbr, 25 tbn
Branch: main
Config:
## configs/prompts/animation_acc.yaml
## dependency models
pretrained_base_mod…
-
### System Info
- `transformers` version: 4.44.2
- Platform: macOS-15.1-arm64-arm-64bit
- Python version: 3.10.14
- Huggingface_hub version: 0.23.3
- Safetensors version: 0.4.3
- Accelerate vers…
-
### System Info
- `transformers` version: 4.42.4
### Who can help?
@Gante
### Information
- [ ] The official example scripts
- [X] My own modified scripts
### Tasks
- [ ] An officially supported tas…
-
### Reminder
- [X] I have read the README and searched the existing issues.
### System Info
- `llamafactory` version: 0.9.1.dev0
- Platform: Linux-5.4.119-1-tlinux4-0010.3-x86_64-with-glibc2.38
-…
-
Here is the development roadmap for 2024 Q4. Contributions and feedback are welcome ([**Join Bi-weekly Development Meeting**](https://t.co/4BFjCLnVHq)). Previous 2024 Q3 roadmap can be found in #634.
…
-
When opening the URL (http://0.0.0.0:7860) I get the "can't reach this page" message. I don't get any errors while loading, apart from the "No module named 'triton'" one, which I assume is normal on …
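Not a maintainer answer, but one common cause worth ruling out: `0.0.0.0` is the wildcard *bind* address the server listens on, not an address to browse to; try `http://127.0.0.1:7860` instead. A small sketch (the helper name `port_is_listening` is my own) to confirm the server is actually accepting connections on loopback:

```python
import socket

def port_is_listening(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP server accepts connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# The server binds 0.0.0.0 (all interfaces), but you reach it through a
# concrete address, e.g.:
#   port_is_listening("127.0.0.1", 7860)
```

If this returns False, the server never actually started listening, and the "can't reach this page" message is a symptom of a startup failure rather than a browser problem.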
-
**Describe the bug**
When attempting to shard a `gemma_2b_en` model across two (consumer-grade) GPUs, I get:
```
ValueError: One of device_put args was given the sharding of NamedSharding(mesh=…
-
### 🚀 The feature, motivation and pitch
Linear attention allows for longer context and faster inference. The Eagle model will have a 2T checkpoint soon.
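To make the complexity argument concrete, here is a minimal sketch of kernelized linear attention in the style of "Transformers are RNNs" (ELU+1 feature map); it is an illustration of the general technique, not a claim about the Eagle model's exact formulation:

```python
import numpy as np

def phi(x):
    # Positive feature map: ELU(x) + 1.
    return np.where(x > 0, x + 1.0, np.exp(x))

def linear_attention(q, k, v):
    # O(N * d * d_v) instead of O(N^2 * d): associativity lets us
    # compute phi(K)^T V once and reuse it for every query.
    qf, kf = phi(q), phi(k)
    kv = kf.T @ v                  # (d, d_v) summary of all keys/values
    z = kf.sum(axis=0)             # per-feature normalizer
    return (qf @ kv) / (qf @ z)[:, None]

def quadratic_reference(q, k, v):
    # The same kernelized attention computed the O(N^2) way.
    a = phi(q) @ phi(k).T
    return (a / a.sum(axis=-1, keepdims=True)) @ v
```

The two functions compute identical outputs; only the associativity of the matrix products changes, which is what makes the linear variant attractive for long contexts.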
### Alternatives
NA
### Additional context
_No r…