-
### 🐛 Describe the bug
```
import os, sys
import torch
from functools import lru_cache, partial
from torch.nn.attention.flex_attention import (
    _DEFAULT_SPARSE_BLOCK_SIZE,
    create_bl…
```
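
The snippet above is truncated; a minimal self-contained version along the same lines (assuming the parenthesized import was heading toward `create_block_mask`, and that `lru_cache` is for the usual mask-caching pattern — both are guesses) might look like:

```python
import torch
from functools import lru_cache
from torch.nn.attention.flex_attention import create_block_mask, flex_attention

# Cache block masks by sequence length so repeated calls reuse the same mask
# (hypothetical completion suggested by the lru_cache import above).
@lru_cache
def causal_block_mask(seq_len: int):
    def causal(b, h, q_idx, kv_idx):
        return q_idx >= kv_idx
    return create_block_mask(causal, B=None, H=None, Q_LEN=seq_len, KV_LEN=seq_len)

q = k = v = torch.randn(1, 8, 1024, 64, device="cuda", dtype=torch.float16)
out = flex_attention(q, k, v, block_mask=causal_block_mask(1024))
```

Note that eager mode works but is slow; wrapping `flex_attention` in `torch.compile` is what makes the block sparsity pay off.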
-
```
D:\anaconda3\envs\drones\python.exe D:\PycharmProj\drones-attention-based-lstm-deep-q-network-rpp-main\main.py
2024-10-30 19:26:50.543598: I tensorflow/core/util/port.cc:153] oneDNN custom operation…
```
-
From this discussion https://github.com/vectordotdev/vector/discussions/21727#discussioncomment-11181503 it came to our attention that there are various undocumented `file` sink metrics. Ideally these…
-
The paper mentions grouping the nodes, which in theory reduces the computational complexity. In the part of the code that computes spatial attention, where is this grouped computation reflected?
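
For reference, node grouping for spatial attention usually looks something like the sketch below: the N nodes are split into G groups and attention is computed only within each group, so each score matrix is (N/G)×(N/G) instead of N×N. This is purely illustrative and not taken from the repository in question:

```python
import torch

def grouped_spatial_attention(x: torch.Tensor, num_groups: int) -> torch.Tensor:
    """Illustrative sketch only: attention within node groups instead of across all N nodes.
    x: (batch, N, dim), with N divisible by num_groups."""
    B, N, D = x.shape
    g = x.view(B, num_groups, N // num_groups, D)  # (B, G, N/G, D): split nodes into groups
    scores = g @ g.transpose(-1, -2) / D**0.5      # (B, G, N/G, N/G): per-group score matrices
    out = scores.softmax(dim=-1) @ g               # aggregate only within each group
    return out.view(B, N, D)

x = torch.randn(2, 64, 32)                         # 64 nodes
y = grouped_spatial_attention(x, num_groups=8)     # eight 8x8 score matrices instead of one 64x64
```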
-
### 🚀 The feature, motivation and pitch
Flash Attention 3 (https://github.com/Dao-AILab/flash-attention) has been in beta for some time. I tested it on H100 GPUs with CUDA 12.3 and also attempted a…
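
For anyone reproducing the test, a minimal correctness check against PyTorch's built-in SDPA, written against the stable flash-attention 2 interface (the FA3 beta under the repo's `hopper/` directory exposes a similarly named `flash_attn_func`, but treat that import path as an assumption):

```python
import torch
import torch.nn.functional as F
from flash_attn import flash_attn_func  # stable FA2 entry point

# flash-attn expects (batch, seqlen, nheads, headdim) tensors in fp16/bf16 on CUDA
q = torch.randn(1, 4096, 16, 128, device="cuda", dtype=torch.bfloat16)
k, v = torch.randn_like(q), torch.randn_like(q)

out_fa = flash_attn_func(q, k, v, causal=True)

# PyTorch SDPA as a reference; it uses the (batch, nheads, seqlen, headdim) layout
out_ref = F.scaled_dot_product_attention(
    q.transpose(1, 2), k.transpose(1, 2), v.transpose(1, 2), is_causal=True
).transpose(1, 2)

torch.testing.assert_close(out_fa, out_ref, atol=2e-2, rtol=2e-2)
```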
-
With https://github.com/tenstorrent/tt-metal/pull/12309, causal SDPA no longer accepts an attention mask. It instead generates its own causal mask. The PR only removed the attention mask from calls to…
-
When opening the URL (http://0.0.0.0:7860) I get the "can't reach this page" message. I don't get any errors while loading, apart from the "No module named 'triton'" one, which I assume is normal on …
-
Will my training yield better results over time? So far, training has taken about 9 hours.
I have 1,500 wav samples with a total audio length of approximately 2 hours.
![Screenshot 2024-11-08 at…
-
From my understanding, flex attention (using `block_mask`) gets faster when the number of empty blocks is larger. If the inputs (Q, K, V) do not represent sequences, but graphs with local connectivity…
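
That intuition can be tested directly: a distance-based `mask_mod` over node indices produces exactly the kind of mostly-empty `BlockMask` described above, assuming nodes are ordered so that connected nodes get nearby indices (the window size `W` here is illustrative):

```python
import torch
from torch.nn.attention.flex_attention import create_block_mask, flex_attention

W = 128  # illustrative locality window in node-index space

def local_graph_mask(b, h, q_idx, kv_idx):
    # Nodes attend only to neighbors within W positions; with a locality-friendly
    # node ordering, most (q_block, kv_block) pairs contain no allowed entries,
    # and flex attention skips those empty blocks entirely.
    return (q_idx - kv_idx).abs() <= W

B, H, N, D = 1, 8, 8192, 64
block_mask = create_block_mask(local_graph_mask, B=None, H=None, Q_LEN=N, KV_LEN=N)
q = k = v = torch.randn(B, H, N, D, device="cuda", dtype=torch.float16)
out = flex_attention(q, k, v, block_mask=block_mask)
```

The speedup depends heavily on the node ordering: the more the adjacency structure is concentrated near the diagonal, the larger the fraction of blocks that can be skipped.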