-
I tried different values (`--attn-layers [1,2,3]`) for the attention mechanism, but the results are either the same or worse. Has anyone found a way to improve FID/IS scores using attention?
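In case it helps to compare notes, below is a minimal sketch of the kind of self-attention block that SAGAN-style image generators insert at a chosen feature-map resolution. This is only my assumption about what an attention layer means here, not what `--attn-layers` actually inserts in this repo:

```
import torch
import torch.nn as nn

class SelfAttention2d(nn.Module):
    """SAGAN-style self-attention over a feature map (illustrative sketch only)."""

    def __init__(self, channels):
        super().__init__()
        self.query = nn.Conv2d(channels, channels // 8, kernel_size=1)
        self.key = nn.Conv2d(channels, channels // 8, kernel_size=1)
        self.value = nn.Conv2d(channels, channels, kernel_size=1)
        self.gamma = nn.Parameter(torch.zeros(1))  # residual gate, starts at 0

    def forward(self, x):
        b, c, h, w = x.shape
        q = self.query(x).flatten(2).transpose(1, 2)  # [B, HW, C//8]
        k = self.key(x).flatten(2)                    # [B, C//8, HW]
        v = self.value(x).flatten(2)                  # [B, C, HW]
        attn = torch.softmax(q @ k, dim=-1)           # [B, HW, HW], softmax over keys
        out = (v @ attn.transpose(1, 2)).reshape(b, c, h, w)
        return x + self.gamma * out                   # residual connection
```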
-
[arxiv](https://arxiv.org/abs/1705.08091)
As suggested in [keithito/tacotron/issues/#72](https://github.com/keithito/tacotron/issues/72),
when using multi-speaker audio data to train Tacotron,
a problem arises where the Enc/Dec alignment does not work well…
-
Hi @LinB203, just wanted to bring [VSTAR: Generative Temporal Nursing for Longer Dynamic Video Synthesis](https://arxiv.org/pdf/2403.13501.pdf) to your attention, where the temporal attention mechanism…
-
```
ass_mask = torch.ones(q_size2 * q_size1, 1, 1, q_size0).cuda()  # [31*128, 1, 1, 11]
x, self.attn_asset = attention(ass_query, ass_key, ass_value, mask=None,
…
```
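The `attention` helper isn't shown in the snippet, but it looks like the usual annotated-Transformer-style function. Below is a minimal sketch under that assumption, mainly to show how a broadcastable mask like `ass_mask` (`[31*128, 1, 1, 11]`) would be applied to the attention scores:

```
import math
import torch

def attention(query, key, value, mask=None, dropout=None):
    """Scaled dot-product attention (sketch of a typical helper, not this repo's code).

    query/key/value: [batch, heads, q_len, d_k] / [batch, heads, k_len, d_k];
    mask broadcasts against the [batch, heads, q_len, k_len] score tensor,
    e.g. a [31*128, 1, 1, 11] mask like ass_mask above.
    """
    d_k = query.size(-1)
    scores = torch.matmul(query, key.transpose(-2, -1)) / math.sqrt(d_k)
    if mask is not None:
        scores = scores.masked_fill(mask == 0, -1e9)  # masked positions get ~zero weight
    p_attn = torch.softmax(scores, dim=-1)
    if dropout is not None:
        p_attn = dropout(p_attn)
    return torch.matmul(p_attn, value), p_attn
```

If that reading is right, note that the snippet above builds `ass_mask` but passes `mask=None`, so the mask would never actually be applied.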
-
## Description
For implementing a pointer mechanism in sequence-to-sequence models it is very practical to re-use attention cells. For example, see the Attention-Based Copy Mechanism described in Jia,…
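For concreteness, here is a rough sketch (the names are mine, not this library's API) of how the weights produced by a shared attention cell can be reused as a copy distribution, pointer-generator style:

```
import torch
import torch.nn as nn

class PointerCopyHead(nn.Module):
    """Sketch: reuse attention weights over the source as a copy distribution."""

    def __init__(self, hidden_size, vocab_size):
        super().__init__()
        self.vocab_proj = nn.Linear(hidden_size, vocab_size)  # generation distribution
        self.copy_gate = nn.Linear(hidden_size, 1)             # p_gen: generate vs. copy

    def forward(self, decoder_state, attn_weights, src_token_ids):
        # decoder_state: [batch, hidden]
        # attn_weights:  [batch, src_len], produced by the shared attention cell
        # src_token_ids: [batch, src_len], vocabulary ids of the source tokens
        p_gen = torch.sigmoid(self.copy_gate(decoder_state))             # [batch, 1]
        p_vocab = torch.softmax(self.vocab_proj(decoder_state), dim=-1)  # [batch, vocab]
        # scatter the attention mass onto the vocabulary ids of the source tokens
        p_copy = torch.zeros_like(p_vocab).scatter_add(1, src_token_ids, attn_weights)
        return p_gen * p_vocab + (1.0 - p_gen) * p_copy
```

The `attn_weights` here are exactly what the existing attention cell already computes when forming the context vector, which is why being able to reuse the cell directly would be so convenient.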
-
Hi, the AttentionDTA you developed is very useful for the interpretability of the DTA prediction task!
But I didn't find the corresponding Attention module in the published project
(e.g. **_how to calcul…
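In case it helps while waiting for the authors, a drug-protein cross-attention map is often computed roughly as in the sketch below. This is only my guess at the general shape of such a module, not code from the AttentionDTA repository:

```
import torch
import torch.nn as nn

class DrugProteinAttention(nn.Module):
    """Sketch of a bilinear attention map between drug and protein features."""

    def __init__(self, drug_dim, prot_dim, attn_dim=64):
        super().__init__()
        self.drug_proj = nn.Linear(drug_dim, attn_dim)
        self.prot_proj = nn.Linear(prot_dim, attn_dim)

    def forward(self, drug_feats, prot_feats):
        # drug_feats: [batch, drug_len, drug_dim], prot_feats: [batch, prot_len, prot_dim]
        d = self.drug_proj(drug_feats)                 # [batch, drug_len, attn_dim]
        p = self.prot_proj(prot_feats)                 # [batch, prot_len, attn_dim]
        scores = torch.bmm(d, p.transpose(1, 2))       # [batch, drug_len, prot_len]
        attn = torch.softmax(scores, dim=-1)           # each drug position attends over residues
        residue_weights = attn.mean(dim=1)             # [batch, prot_len], useful for heat maps
        context = torch.bmm(attn, prot_feats)          # [batch, drug_len, prot_dim]
        return context, attn, residue_weights
```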
-
I'm finding that training a 1-expert dMoE (brown) has worse training loss than an otherwise-equivalent dense model (green). Is there some reason why this difference is expected, or can I expect them to…
-
This is a little bit of a plug, so I'll keep it short! I'm trying to nail down _**exactly** what's going on here_.
https://riprompt.com
https://riprompt.com/riprompt.txt
https://chatgpt.com/g/g-9…
-
This was initially brought to my attention by https://neurostars.org/t/cloning-hbn-cpac-data/30587/4 .
I cloned this dataset locally to investigate and found that it consumes 1.3G of `.git/objects` and…
-
We know that flash attention supports `cu_seqlens`, which removes padding for variable-length inputs in a batch and stores only the real (non-padding) tokens. This can be useful for optimizing the computational eff…
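For reference, here is a minimal sketch of how `cu_seqlens` is typically built from per-sequence lengths and passed to the varlen kernel (based on my reading of the flash-attn Python API; please check the exact signature against the installed version):

```
import torch
from flash_attn import flash_attn_varlen_func

# three sequences of different lengths packed into one unpadded "batch"
seqlens = torch.tensor([5, 9, 3], dtype=torch.int32, device="cuda")
total_tokens = int(seqlens.sum())          # 17 tokens, no padding stored
nheads, headdim = 8, 64

# cumulative sequence lengths [0, 5, 14, 17]; sequence i owns tokens cu[i]:cu[i+1]
cu_seqlens = torch.zeros(len(seqlens) + 1, dtype=torch.int32, device="cuda")
cu_seqlens[1:] = torch.cumsum(seqlens, dim=0)

# q/k/v are packed over all real tokens: [total_tokens, nheads, headdim]
q = torch.randn(total_tokens, nheads, headdim, dtype=torch.float16, device="cuda")
k = torch.randn(total_tokens, nheads, headdim, dtype=torch.float16, device="cuda")
v = torch.randn(total_tokens, nheads, headdim, dtype=torch.float16, device="cuda")

out = flash_attn_varlen_func(
    q, k, v,
    cu_seqlens_q=cu_seqlens, cu_seqlens_k=cu_seqlens,
    max_seqlen_q=int(seqlens.max()), max_seqlen_k=int(seqlens.max()),
    causal=True,
)  # out: [total_tokens, nheads, headdim], still unpadded
```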