lucidrains / reformer-pytorch
Reformer, the efficient Transformer, in Pytorch
MIT License · 2.1k stars · 254 forks
Issues
All issues below are closed, with last activity about 4 years ago. Format: #number · title · author · comment count.

#106 · Visualization from Recorder · nitinnairk · 2 comments
#105 · Added XLA/TPU compatibility · hichiaty · 1 comment
#104 · Fix backward multi-round hashing · patrickvonplaten · 3 comments
#103 · Reformer encoder-decoder architecture · AliOskooeiTR · 9 comments
#102 · Beam search generating more than just the top sentence · anisdismail · 3 comments
#101 · Why do we have negative self-attention scores? · muiPomeranian · 31 comments
#100 · NaN values (not too often) from self-attention of Q and K · muiPomeranian · 1 comment
#99 · Is it safe to assume this repo's setup is the same as the original paper? · muiPomeranian · 1 comment
#98 · Experiment with Ranger, the state-of-the-art meta optimizer · LifeIsStrange · 3 comments
#97 · Reformer pre-training gradient accumulation · apoorv2904 · 2 comments
#96 · Pre-training on my text data · doppler21 · 0 comments
#95 · Applications for the Reformer model · lucky-23 · 4 comments
#94 · Question: Is chunking the input expected behavior in the standard Reformer? · hotohoto · 1 comment
#93 · Why do we dot product with look-back? · muiPomeranian · 5 comments
#92 · enwik8_simple example: what is it doing? · felipeboffnunes · 1 comment
#91 · Why multiply seq_len (4096) * buckets + ticker % 4096 before sorting? · muiPomeranian · 3 comments
#90 · Possible bug in enc-dec attention? · py4 · 18 comments
#89 · Un-deprecated · david-macleod · 2 comments
#88 · Error with shapes in LocalAttention · ilya16 · 5 comments
#87 · Documentation for masks · calebh · 5 comments
#86 · Add ReZero · lucidrains · 0 comments
#85 · Add local attention · lucidrains · 0 comments
#84 · Error when training MLM · singhay · 0 comments
#83 · Training too slow, parameter check · singhay · 4 comments
#82 · Eval loss not consistent across multiple iterations · singhay · 1 comment
#81 · Add caching of buckets for the reversible net · lucidrains · 0 comments
#80 · Question on enc_input_mask and ignore_index / pad_value for EncDec · nakarinh14 · 6 comments
#79 · A typo regarding kwargs · xuanqing94 · 3 comments
#78 · Update README.md · Kyeongpil · 2 comments
#77 · A question about allow_duplicate_attention · L-Hugh · 2 comments
#76 · RuntimeError: tabulate: failed to synchronize: cudaErrorAssert: device-side assert triggered · xijiz · 2 comments
#75 · Performance · zhao1402072392 · 4 comments
#74 · Reset input_mask during generation · liuqiangict · 13 comments
#73 · Which function executes the multi-head attention part? · arushi-08 · 4 comments
#72 · Reformer for translation task: AssertionError: sequence length needs to be divisible by bucket size * 2 · arushi-08 · 3 comments
#71 · Pretrained models? · Hellisotherpeople · 1 comment
#70 · Results · gentaiscool · 1 comment
#69 · DeepSpeed and the generate method · CalogeroZarbo · 14 comments
#68 · Experiment · betwinam · 1 comment
#67 · `return_embedding` seems to be a no-op · erip · 3 comments
#66 · Is enc_input_mask equivalent to setting pad_idx and ignore_idx? · CalogeroZarbo · 2 comments
#65 · Image generation example · liuqiangict · 1 comment
#64 · Predicting with the encoder-decoder structure · nakarinh14 · 4 comments
#63 · Mish as activation function · CalogeroZarbo · 3 comments
#62 · BERT-like masked language pre-training? · JunhaoWang · 1 comment
#61 · Typo in LSHSelfAttention causing a NameError exception · lduperier · 1 comment
#60 · DeepSpeed and nn.Embedding issue · CalogeroZarbo · 7 comments
#59 · A full Reformer image → caption example was wrong · TrungThanhTran · 4 comments
#58 · Make sure the bucket for local attention is unique · lucidrains · 0 comments
#57 · Add local attention hash, recommended by @AranKomat · lucidrains · 0 comments