lucidrains / reformer-pytorch
Reformer, the efficient Transformer, in Pytorch
MIT License · 2.1k stars · 254 forks
Issues
All issues below are closed, with last activity about 4 years ago. Format: #number · title · author · comment count.

#106 · Visualization from Recorder · nitinnairk · 2 comments
#105 · Added XLA/TPU compatibility · hichiaty · 1 comment
#104 · Fix backward multi-round hashing · patrickvonplaten · 3 comments
#103 · Reformer encoder-decoder architecture · AliOskooeiTR · 9 comments
#102 · Beam search generating more than just the top sentence · anisdismail · 3 comments
#101 · Why do we have negative self-attention scores? · muiPomeranian · 31 comments
#100 · NaN values (not too often) from self-attention of Q and K · muiPomeranian · 1 comment
#99 · Is it safe to assume this repo's setup is the same as the original paper? · muiPomeranian · 1 comment
#98 · Experiment with Ranger, the state-of-the-art meta optimizer · LifeIsStrange · 3 comments
#97 · Reformer pre-training gradient accumulation · apoorv2904 · 2 comments
#96 · Pre-training on my text data · doppler21 · 0 comments
#95 · Applications for the Reformer model · lucky-23 · 4 comments
#94 · Question: Is chunking the input expected behavior in the standard Reformer? · hotohoto · 1 comment
#93 · Why do we dot product with look-back? · muiPomeranian · 5 comments
#92 · enwik8_simple example: what is it doing? · felipeboffnunes · 1 comment
#91 · Why multiply seq_len (4096) * buckets + ticker % 4096 before sorting? · muiPomeranian · 3 comments
#90 · Possible bug in enc-dec attention? · py4 · 18 comments
#89 · Un-deprecated · david-macleod · 2 comments
#88 · Error with shapes in LocalAttention · ilya16 · 5 comments
#87 · Documentation for masks · calebh · 5 comments
#86 · Add ReZero · lucidrains · 0 comments
#85 · Add local attention · lucidrains · 0 comments
#84 · Error when training MLM · singhay · 0 comments
#83 · Training too slow, parameter check · singhay · 4 comments
#82 · Eval loss not consistent across multiple iterations · singhay · 1 comment
#81 · Add caching of buckets for the reversible net · lucidrains · 0 comments
#80 · Question on enc_input_mask and ignore_index / pad_value for EncDec · nakarinh14 · 6 comments
#79 · A typo regarding kwargs · xuanqing94 · 3 comments
#78 · Update README.md · Kyeongpil · 2 comments
#77 · A question about allow_duplicate_attention · L-Hugh · 2 comments
#76 · RuntimeError: tabulate: failed to synchronize: cudaErrorAssert: device-side assert triggered · xijiz · 2 comments
#75 · Performance · zhao1402072392 · 4 comments
#74 · Reset input_mask during generation · liuqiangict · 13 comments
#73 · Which function executes the multi-head attention part? · arushi-08 · 4 comments
#72 · Reformer for translation task: AssertionError: sequence length needs to be divisible by bucket size * 2 · arushi-08 · 3 comments
#71 · Pretrained models? · Hellisotherpeople · 1 comment
#70 · Results · gentaiscool · 1 comment
#69 · DeepSpeed and the generate method · CalogeroZarbo · 14 comments
#68 · Experiment · betwinam · 1 comment
#67 · `return_embedding` seems to be a no-op · erip · 3 comments
#66 · Is enc_input_mask equivalent to setting pad_idx and ignore_idx? · CalogeroZarbo · 2 comments
#65 · Image generation example · liuqiangict · 1 comment
#64 · Predicting with the encoder-decoder structure · nakarinh14 · 4 comments
#63 · Mish as activation function · CalogeroZarbo · 3 comments
#62 · BERT-like masked language pre-training? · JunhaoWang · 1 comment
#61 · Typo in LSHSelfAttention causing a NameError exception · lduperier · 1 comment
#60 · DeepSpeed and nn.Embedding issue · CalogeroZarbo · 7 comments
#59 · A full Reformer image → caption example was wrong · TrungThanhTran · 4 comments
#58 · Make sure the bucket for local attention is unique · lucidrains · 0 comments
#57 · Add local attention hash, recommended by @AranKomat · lucidrains · 0 comments