local-attention Search Results

1000+ results for local-attention


unslothai/unsloth #601

Error message when using ORPO fine-tuning

When using ORPO to fine-tune mistral-7b-instruct-v0.3-bnb-4bit, after clicking orpo_trainer.train() to start, the following error message appears: `-------------------------------------------------…

MRQJsfhf updated 4 months ago, 1 comment
NVlabs/EAGLE #20

NotImplementedError: Cannot copy out of meta tensor; no data…

Hi, I am getting this issue. I am running it on following system. I followed the instructions given in README. Windows 11 Home Intel Core i9 32GB RAM ( I tried with Anaconda and python3.11.9 and…

clock-workorange updated 2 days ago, 2 comments
withastro/astro #12262

Chinese routing display error in 5.0.0-beta.5 version

### Astro Info ```block Astro v5.0.0-beta.5 Node v20.15.0 System Windows (x64) Package Manager npm Output st…

JinMokai updated 3 days ago, 3 comments
vllm-project/vllm #3385

Attention sliding window

In Hugging Face "eager" Mistral implementation, a sliding window of size 2048 will mask 2049 tokens. This is also true for flash attention. In the current vLLM implementation a window of 2048 will mas…

caiom updated 3 weeks ago, 9 comments
TMElyralab/MusePose #81

Stage2 RuntimeError: The size of tensor a (22) must match th…

~/MusePose# accelerate launch train_stage_2.py --config configs/train/stage2.yaml The following values were not passed to `accelerate launch` and had defaults used instead: `--num_processes`…

FangSen9000 updated 1 week ago, 1 comment
vllm-project/vllm #8230

[Bug]: vLLM 0.5.5 using prefix caching causing CUDA error: i…

### Your current environment The output of `python collect_env.py` ```text Your output of `python collect_env.py` here ``` ### 🐛 Describe the bug --enable-prefix-caching causing …

Sekri0 updated 1 month ago, 7 comments
vllm-project/vllm #7519

[Feature]: Context Parallelism

### 🚀 The feature, motivation and pitch As we can see, Google Gemini can support up to million tokens and to serve longer context length, we have to do context parallelism, which means, split the i…

huseinzol05 updated 2 weeks ago, 8 comments
lutris/lutris #5509

Games can no longer be launched from steam deck gaming mode

### Bug description When launching a game from the steam deck gaming mode, installed using the flatpak version of Lutris 0.5.17, it immediately crashes/fails to launch. Launching the same game from…

Nevon updated 2 months ago, 15 comments
gophercloud/gophercloud #3200

Supporting Neutron security address groups

Currently the option to create a security group rule only allows RemoteGroupID or RemoteIPPrefix on rule creation, could this be extended to allow the use of remote address groups per the API? htt…

Jscott377 updated 1 week ago, 3 comments
Genymobile/scrcpy #5274

Can't turn screen off when mirroring while using Tecno Pova …

- OS: [LMDE (Linux Mint Debian Version) 6] - Scrcpy version: 2.6.1 - Installation method: cloning from repo using terminal - Device model: Tecno LH7n - Android version: 14 When I tried to …

zim267 updated 1 month ago, 1 comment
