-
When expanding the key/value heads to the number of query heads in a GQA architecture, the vectors need to be cloned in an interleaved way, not just repeated.
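A minimal sketch of the difference, assuming PyTorch and a hypothetical expansion factor `n_rep` (the names and sizes here are mine): `repeat_interleave` keeps the copies of each KV head adjacent, which matches the layout standard GQA expansions (e.g. Llama-style `repeat_kv`) produce, while a plain `repeat`/tile cycles through the heads instead.
```python
import torch

num_kv_heads, n_rep, head_dim = 2, 3, 4  # hypothetical sizes
# Each row holds one KV head; row values equal the head index for visibility.
kv = torch.arange(num_kv_heads).float().view(num_kv_heads, 1).expand(num_kv_heads, head_dim)

# Interleaved cloning: each KV head's copies stay adjacent -> heads [0, 0, 0, 1, 1, 1]
interleaved = kv.repeat_interleave(n_rep, dim=0)

# Plain repeating: cycles through the heads instead -> heads [0, 1, 0, 1, 0, 1]
repeated = kv.repeat(n_rep, 1)

print(interleaved[:, 0].tolist())  # [0.0, 0.0, 0.0, 1.0, 1.0, 1.0]
print(repeated[:, 0].tolist())     # [0.0, 1.0, 0.0, 1.0, 0.0, 1.0]
```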
-
When we enable flash attention, it is hard to debug if the results do not match.
So we'd like to add prints to make the debugging process easier.
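For example, something along these lines (a sketch only, using PyTorch's `scaled_dot_product_attention` as a stand-in for the flash path; tensor shapes are hypothetical):
```python
import torch
import torch.nn.functional as F

q = torch.randn(1, 8, 128, 64, device="cuda", dtype=torch.float16)
k = torch.randn_like(q)
v = torch.randn_like(q)

# Fused / flash-style path.
out_flash = F.scaled_dot_product_attention(q, k, v)

# Naive float32 reference for comparison.
qf, kf, vf = q.float(), k.float(), v.float()
attn = torch.softmax(qf @ kf.transpose(-2, -1) / qf.shape[-1] ** 0.5, dim=-1)
out_ref = attn @ vf

# Print the mismatch so it is easy to see where the results diverge.
diff = (out_flash.float() - out_ref).abs()
print(f"max abs diff: {diff.max().item():.6f}, mean abs diff: {diff.mean().item():.6f}")
```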
-
# 🐛 Bug
If I use MemoryEfficientAttentionFlashAttentionOp as my attention op for memory-efficient attention and also pass an attention bias, it gives me errors :(
## Command
```
import math
impor…
```
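A minimal repro of the setup I am describing (a sketch with hypothetical shapes; the sizes and the bias tensor are not from my actual code):
```python
import torch
import xformers.ops as xops

B, M, H, K = 1, 128, 8, 64  # hypothetical batch, seq len, heads, head dim
q = torch.randn(B, M, H, K, device="cuda", dtype=torch.float16)
k = torch.randn(B, M, H, K, device="cuda", dtype=torch.float16)
v = torch.randn(B, M, H, K, device="cuda", dtype=torch.float16)
bias = torch.randn(B, H, M, M, device="cuda", dtype=torch.float16)

# Passing a tensor attention bias while forcing the flash op is what errors out for me.
out = xops.memory_efficient_attention(
    q, k, v,
    attn_bias=bias,
    op=xops.MemoryEfficientAttentionFlashAttentionOp,
)
```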
-
Hi, I would like to ask about the Deformable Attention mechanism in the paper.
I went through the paper DEFORMABLE DETR: DEFORMABLE TRANSFORMERS FOR END-TO-END OBJECT DETECTION, and the Deformable Atten…
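For reference, the deformable attention operator the paper defines, as I understand it, is

$$
\mathrm{DeformAttn}(z_q, p_q, x) = \sum_{m=1}^{M} W_m \left[ \sum_{k=1}^{K} A_{mqk} \cdot W'_m\, x\!\left(p_q + \Delta p_{mqk}\right) \right]
$$

where $M$ is the number of attention heads, $K$ the number of sampled keys per head, $A_{mqk}$ the normalized attention weights, $\Delta p_{mqk}$ the learned sampling offsets around the reference point $p_q$, and $W_m$, $W'_m$ learned projections.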
-
### System Info
- `transformers` version: 4.44.2
- Platform: Linux-6.8.0-40-generic-x86_64-with-glibc2.35
- Python version: 3.10.12
- Huggingface_hub version: 0.24.6
- Safetensors version: 0.…
-
Cross-Layer Attention (CLA), recently proposed by MIT, can significantly reduce runtime KV-cache memory usage by sharing the KV cache across adjacent layers. Does vLLM have any plans to support it? Thanks!
Cross-Layer Attention paper: https://arxiv.or…
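A rough sketch of the idea (not vLLM code; the module structure and names are mine): only layers that produce KV need a cache, and the following layer reuses it.
```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CLABlock(nn.Module):
    """Attention block that either produces KV or reuses KV from an earlier layer."""

    def __init__(self, dim: int, num_heads: int, produces_kv: bool):
        super().__init__()
        self.num_heads = num_heads
        self.head_dim = dim // num_heads
        self.q_proj = nn.Linear(dim, dim)
        self.o_proj = nn.Linear(dim, dim)
        self.produces_kv = produces_kv
        if produces_kv:
            self.kv_proj = nn.Linear(dim, 2 * dim)

    def forward(self, x, shared_kv=None):
        B, T, _ = x.shape
        q = self.q_proj(x).view(B, T, self.num_heads, self.head_dim).transpose(1, 2)
        if self.produces_kv:
            k, v = self.kv_proj(x).chunk(2, dim=-1)
            k = k.view(B, T, self.num_heads, self.head_dim).transpose(1, 2)
            v = v.view(B, T, self.num_heads, self.head_dim).transpose(1, 2)
            shared_kv = (k, v)  # only this layer's KV would need to be cached
        else:
            k, v = shared_kv  # reuse the producing layer's KV (and its KV cache)
        out = F.scaled_dot_product_attention(q, k, v)
        return self.o_proj(out.transpose(1, 2).reshape(B, T, -1)), shared_kv

# Usage sketch: even layers produce KV, odd layers reuse it, so roughly half of
# the layers need a KV cache at all.
dim, heads = 64, 4
layers = nn.ModuleList([CLABlock(dim, heads, produces_kv=(i % 2 == 0)) for i in range(4)])
x, kv = torch.randn(2, 16, dim), None
for layer in layers:
    x, kv = layer(x, shared_kv=kv)
```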
-
### Your current environment
code review
### 🐛 Describe the bug
In `flash_attn.py`, `forward` function:
```
else:
    # prefix-enabled attention
    assert prefill_m…
```
-
When I'm trying to use Videocrafter 2, I get this error:
F:\Pinokio\api\videocrafter2.git\app\env\lib\site-packages\torch\nn\functional.py:5560: UserWarning: 1Torch was not compiled with flash att…
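For what it's worth, this is how I checked which `scaled_dot_product_attention` backends my install has enabled (a sketch, assuming a CUDA build of PyTorch); the warning just means SDPA falls back to a non-flash backend:
```python
import torch
import torch.nn.functional as F

# Which scaled_dot_product_attention backends are currently enabled?
print("flash:", torch.backends.cuda.flash_sdp_enabled())
print("mem_efficient:", torch.backends.cuda.mem_efficient_sdp_enabled())
print("math:", torch.backends.cuda.math_sdp_enabled())

q = torch.randn(1, 8, 64, 64, device="cuda", dtype=torch.float16)
k, v = torch.randn_like(q), torch.randn_like(q)

# SDPA still runs when flash is unavailable; it just uses another backend.
with torch.backends.cuda.sdp_kernel(enable_flash=False, enable_math=True, enable_mem_efficient=True):
    out = F.scaled_dot_product_attention(q, k, v)
print(out.shape)
```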
-
### Describe the bug
I accidentally introduced a bug in this [PR](https://github.com/huggingface/diffusers/pull/5181) by adding a condition on [this line](https://github.com/huggingface/diffusers/blo…
-
Dear Author,
I am trying to locate the section of the code that handles the cross-attention layer between the text embeddings and the visual embeddings. Could you please guide me to the relevant part of the c…
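Concretely, I am looking for the layer that does something like the following (a generic sketch, not your code; all names are mine), with queries from the visual tokens and keys/values from the text tokens:
```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CrossAttention(nn.Module):
    """Generic cross-attention: visual tokens attend to text tokens."""

    def __init__(self, dim: int, num_heads: int):
        super().__init__()
        self.num_heads = num_heads
        self.head_dim = dim // num_heads
        self.to_q = nn.Linear(dim, dim)       # queries from visual embeddings
        self.to_kv = nn.Linear(dim, 2 * dim)  # keys/values from text embeddings
        self.to_out = nn.Linear(dim, dim)

    def forward(self, visual: torch.Tensor, text: torch.Tensor) -> torch.Tensor:
        B, Tv, _ = visual.shape
        q = self.to_q(visual).view(B, Tv, self.num_heads, self.head_dim).transpose(1, 2)
        k, v = self.to_kv(text).chunk(2, dim=-1)
        k = k.view(B, -1, self.num_heads, self.head_dim).transpose(1, 2)
        v = v.view(B, -1, self.num_heads, self.head_dim).transpose(1, 2)
        out = F.scaled_dot_product_attention(q, k, v)
        return self.to_out(out.transpose(1, 2).reshape(B, Tv, -1))
```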