-
https://github.com/xjdr-alt/entropix/blob/eaaddb27f344c8c28922c7bfd0e01006645729ae/entropix/torch_sampler.py#L56-L58
This calculation computes entropy over the attention scores for each position in…
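A minimal sketch of what a per-position attention-entropy computation of this kind can look like (the tensor shapes, names, and use of natural log are assumptions for illustration, not the exact entropix code):

```python
import torch
import torch.nn.functional as F

def attention_entropy(scores: torch.Tensor) -> torch.Tensor:
    """Entropy of the attention distribution at each query position.

    scores: raw attention logits, shape (batch, heads, q_len, kv_len).
    Returns entropy in nats, shape (batch, heads, q_len).
    """
    probs = F.softmax(scores, dim=-1)
    log_probs = F.log_softmax(scores, dim=-1)  # more stable than log(softmax(...))
    return -(probs * log_probs).sum(dim=-1)

# Uniform attention over 64 keys gives entropy log(64) ≈ 4.16 nats.
print(attention_entropy(torch.zeros(1, 8, 16, 64))[0, 0, 0])
```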
-
When I use meta-llama/Llama-3.2-1B, I get the following error.
Can it be fixed?
```
RuntimeError: Error(s) in loading state_dict for Transformer:
Missing key(s) in state_dict: "tok_embeddings.weight", "layers.0.attention.wq…
```
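The names in the error (`tok_embeddings.weight`, `layers.N.attention.wq.weight`) follow Meta's original checkpoint layout, while Hugging Face checkpoints use names like `model.embed_tokens.weight` and `model.layers.N.self_attn.q_proj.weight`, so this usually means a checkpoint in the wrong format is being loaded. Below is a hedged sketch of renaming a Hugging Face state dict toward the original naming; the table is illustrative and incomplete, and a real conversion also has to undo the rotary permutation applied to the q/k projections:

```python
import re

# Hypothetical rename table: Hugging Face Llama names -> original Meta-style names.
RENAMES = {
    r"^model\.embed_tokens\.weight$": "tok_embeddings.weight",
    r"^model\.layers\.(\d+)\.self_attn\.q_proj\.weight$": r"layers.\1.attention.wq.weight",
    r"^model\.layers\.(\d+)\.self_attn\.k_proj\.weight$": r"layers.\1.attention.wk.weight",
    r"^model\.layers\.(\d+)\.self_attn\.v_proj\.weight$": r"layers.\1.attention.wv.weight",
    r"^model\.layers\.(\d+)\.self_attn\.o_proj\.weight$": r"layers.\1.attention.wo.weight",
}

def rename_state_dict(hf_state_dict: dict) -> dict:
    out = {}
    for name, tensor in hf_state_dict.items():
        for pattern, repl in RENAMES.items():
            new_name, n = re.subn(pattern, repl, name)
            if n:
                out[new_name] = tensor
                break
        else:
            out[name] = tensor  # keep keys we have no rule for
    return out
```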
-
### 🐛 Describe the bug
Flex attention with dynamic shapes fails when comparing relational expressions. I found two places where this error occurs.
One in `flex_decoding.py`:
```
File "/usr/local/li…
-
'roberta.encoder.layer.22.crossattention.self.abs_bias.1', 'roberta.encoder.layer.8.attention.self.abs_bias.0', 'roberta.encoder.layer.9.attention.self.abs_bias.1', 'roberta.encoder.layer.5.attention.…
-
### 🐛 Describe the bug
I tried to implement `causal_lower_right` masking in flex attention. This requires the masking function to know the difference between the key and query lengths:
```python
…
```
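One way to express this is a closure that captures the key/query length difference as an offset; a hedged sketch (PyTorch 2.5+ on CUDA assumed; the names and offset handling are mine, not the original code):

```python
import torch
from torch.nn.attention.flex_attention import flex_attention, create_block_mask

def causal_lower_right(q_len: int, kv_len: int):
    offset = kv_len - q_len  # how far queries are shifted relative to keys

    def mask_mod(b, h, q_idx, kv_idx):
        # Query i may attend to keys 0 .. i + offset (causal, aligned bottom-right).
        return kv_idx <= q_idx + offset

    return mask_mod

q_len, kv_len = 128, 256
q = torch.randn(1, 8, q_len, 64, device="cuda", dtype=torch.float16)
k = torch.randn(1, 8, kv_len, 64, device="cuda", dtype=torch.float16)
v = torch.randn_like(k)

block_mask = create_block_mask(causal_lower_right(q_len, kv_len),
                               B=None, H=None, Q_LEN=q_len, KV_LEN=kv_len, device="cuda")
out = flex_attention(q, k, v, block_mask=block_mask)
```

The offset is captured at mask-creation time, which is exactly why the masking function needs to know both lengths up front.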
-
### 🚀 The feature, motivation and pitch
I am working on 4D attention mask input for the LLM generation process. Hugging Face provides an interface for the 4D attention mask. Does vLLM have any plan to support it? htt…
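For context, this is roughly how the Hugging Face side accepts a custom 4D mask; a hedged sketch assuming the convention in recent transformers releases that a 4D `attention_mask` is additive ("inverted"): 0 where attention is allowed, a large negative value where it is blocked. The model name is just the one from the thread above and may be gated:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "meta-llama/Llama-3.2-1B"  # assumption: any recent causal LM that accepts 4D masks
tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(name, torch_dtype=torch.float32)

ids = tok("two packed segments in one row", return_tensors="pt").input_ids
n = ids.shape[1]

# 4D additive mask of shape (batch, 1, query_len, key_len):
# 0 where attention is allowed, dtype-min where it is blocked (plain causal here).
allowed = torch.tril(torch.ones(n, n, dtype=torch.bool))
mask_4d = torch.full((1, 1, n, n), torch.finfo(model.dtype).min)
mask_4d = mask_4d.masked_fill(allowed, 0.0)

out = model(input_ids=ids, attention_mask=mask_4d)
```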
-
### Search before asking
- [X] I have searched the Ultralytics YOLO [issues](https://github.com/ultralytics/ultralytics/issues) and [discussions](https://github.com/ultralytics/ultralytics/discussion…
-
### Problem Description
I often get these errors from [various applications](https://github.com/pytorch/pytorch/issues/134208); this one is from ComfyUI.
Is scaled_dot_product_attention part of fl…
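A common way to narrow down this kind of failure is to force `scaled_dot_product_attention` onto one backend at a time; a diagnostic sketch (the shapes are arbitrary, and which backend works depends on your build and GPU):

```python
import torch
import torch.nn.functional as F
from torch.nn.attention import SDPBackend, sdpa_kernel

q = torch.randn(1, 8, 1024, 64, device="cuda", dtype=torch.float16)
k, v = torch.randn_like(q), torch.randn_like(q)

# Try the fused backends first, then fall back to the math (reference) implementation.
for backend in (SDPBackend.FLASH_ATTENTION, SDPBackend.EFFICIENT_ATTENTION, SDPBackend.MATH):
    try:
        with sdpa_kernel(backend):
            out = F.scaled_dot_product_attention(q, k, v, is_causal=True)
        print(f"{backend} succeeded")
        break
    except RuntimeError as e:
        print(f"{backend} failed: {e}")
```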
-
# ❓ Questions and Help
I am new to xformers, and I want to speed up my Transformer models with it. But I found that `xformers` gives no speedup compared with `scaled_dot_product_attention` from PyTorch. Here …
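When comparing the two, note that the expected layouts differ: `memory_efficient_attention` takes `(batch, seq, heads, head_dim)` while SDPA takes `(batch, heads, seq, head_dim)`. A hedged benchmarking sketch (timings depend heavily on dtype, head dim, and GPU):

```python
import torch
import torch.nn.functional as F
import xformers.ops as xops

B, H, S, D = 4, 16, 2048, 64
q = torch.randn(B, H, S, D, device="cuda", dtype=torch.float16)
k, v = torch.randn_like(q), torch.randn_like(q)

def bench(fn, iters=50):
    for _ in range(5):  # warmup
        fn()
    torch.cuda.synchronize()
    start = torch.cuda.Event(enable_timing=True)
    end = torch.cuda.Event(enable_timing=True)
    start.record()
    for _ in range(iters):
        fn()
    end.record()
    torch.cuda.synchronize()
    return start.elapsed_time(end) / iters  # milliseconds per call

sdpa_ms = bench(lambda: F.scaled_dot_product_attention(q, k, v))

# xformers wants (B, S, H, D), so transpose before calling it.
qx, kx, vx = (t.transpose(1, 2).contiguous() for t in (q, k, v))
xf_ms = bench(lambda: xops.memory_efficient_attention(qx, kx, vx))

print(f"SDPA: {sdpa_ms:.3f} ms, xformers: {xf_ms:.3f} ms")
```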
-
The following edits were required to make llama3 8b fp16 work:
```
config["attn_head_count"] = 8 # 8 instead of 32
config["paged_kv_cache"] = {}
config["paged_kv_cache"]["block_seq_stride"] = conf…