lucidrains / x-transformers
A simple but complete full-attention transformer with a set of promising experimental features from various papers
MIT License · 4.37k stars · 370 forks
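For context, a minimal usage sketch of the library, adapted from the project README; the vocabulary size, model dimensions, and sequence length below are illustrative placeholders, not recommendations:

    import torch
    from x_transformers import TransformerWrapper, Decoder

    # a small decoder-only language model; all sizes are placeholders
    model = TransformerWrapper(
        num_tokens = 20000,        # vocabulary size
        max_seq_len = 1024,        # maximum sequence length
        attn_layers = Decoder(
            dim = 512,
            depth = 6,
            heads = 8
        )
    )

    x = torch.randint(0, 20000, (1, 1024))   # (batch, seq) of token ids
    logits = model(x)                         # (1, 1024, 20000)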
Issues (newest first)
#257 · Is this the same "X-Transformer" that is being used in the paper "X-Transformer: A Machine Translation Model Enhanced by the Self-Attention Mechanism"? · argadewanata · closed 1 week ago · 1 comment
#256 · Random lack of gradients · Baran-phys · closed 1 month ago · 1 comment
#255 · Problem with cache and memory · Baran-phys · closed 1 month ago · 0 comments
#254 · Enabling flash attention does not support BFloat16? · Kaimary · closed 1 week ago · 1 comment
#253 · How to use "src_key_padding_mask" · LutherLin · closed 1 month ago · 2 comments
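On #253: x-transformers does not take torch.nn.Transformer's src_key_padding_mask argument; the closest equivalent appears to be the boolean mask keyword on the forward pass, where True marks positions that should be attended to. A minimal sketch under that assumption, with illustrative sizes:

    import torch
    from x_transformers import TransformerWrapper, Encoder

    model = TransformerWrapper(
        num_tokens = 20000,
        max_seq_len = 1024,
        attn_layers = Encoder(dim = 512, depth = 6, heads = 8)
    )

    x = torch.randint(0, 20000, (2, 1024))   # (batch, seq) token ids
    mask = torch.ones(2, 1024).bool()         # True = keep / attend to this position
    mask[:, 512:] = False                      # e.g. the second half is padding

    out = model(x, mask = mask)                # padded positions are ignored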
#252 · Sinusoidal embedding order choice different from original definition · gordicaleksa · closed 1 month ago · 1 comment
#251 · Migrate to a less confusing way of doing rotary · lucidrains · closed 1 month ago · 1 comment
#250 · RoPE inconsistency (2-dim subspaces choice) · gordicaleksa · closed 1 month ago · 0 comments
#249 · Correct interaction between CLS token and RoPE · oasidorshin · closed 2 months ago · 5 comments
#248 · Question: problem with xval implementation · HarshaSatyavardhan · closed 2 months ago · 5 comments
#247 · [Bug] Error when `rotary_pos_emb` set to True in cross attention · BakerBunker · closed 2 months ago · 3 comments
#246 · Was it a clerical error? ScaleNorm.g is initialized from dim ** -0.5; I think it should be dim ** 0.5 · junphine · closed 2 months ago · 1 comment
#245 · [Question] Very small attention scores · pfeatherstone · closed 1 month ago · 7 comments
#244 · Pass custom scale to flash attention · Subuday · closed 4 months ago · 5 comments
#243 · ContinuousTransformerWrapper: turning on absolute positional embedding should mirror TransformerWrapper · pfeatherstone · closed 4 months ago · 2 comments
#242 · [Bug] XL-recurrence with AlibiPositionalBias and mems not working correctly · pfeatherstone · closed 2 months ago · 17 comments
#241 · Question: rotary embeddings and bad length extrapolation · pfeatherstone · closed 1 month ago · 1 comment
#240 · How can I add custom attention masks to a Decoder? · DerEchteFeuerpfeil · closed 4 months ago · 3 comments
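On #240: a hedged sketch of passing a custom attention mask. The attn_mask keyword and its broadcast shape are assumptions about the library's forward signature and may differ between versions; check the Attention layer in x_transformers.py for the installed release.

    import torch
    from x_transformers import TransformerWrapper, Decoder

    model = TransformerWrapper(
        num_tokens = 20000,
        max_seq_len = 256,
        attn_layers = Decoder(dim = 512, depth = 4, heads = 8)
    )

    x = torch.randint(0, 20000, (1, 256))

    # custom boolean attention mask, True = this (query, key) pair may attend;
    # assumed to broadcast over (batch, heads, query_len, key_len)
    attn_mask = torch.ones(256, 256).tril().bool()   # e.g. an explicit causal mask

    logits = model(x, attn_mask = attn_mask)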
#239 · Confusion about image->caption example · mtran14 · opened 5 months ago · 1 comment
#238 · Generation for PaLI? · BurgerAndreas · opened 5 months ago · 0 comments
#237 · `layer_mem` is unbound (when called from `ContinuousTransformerWrapper`) · amitkparekh · closed 5 months ago · 6 comments
#236 · Multi Input/output transformers · RyanKim17920 · opened 5 months ago · 5 comments
#235 · Multi Input/Output transformers · RyanKim17920 · closed 5 months ago · 1 comment
#234 · Fix xpos when using mems · pfeatherstone · closed 5 months ago · 3 comments
#233 · RotaryEmbedding XPOS doesn't work with mems · pfeatherstone · closed 2 days ago · 5 comments
#232 · [Minor; noob question] Uniform distribution instead of normal · p0p4k · opened 5 months ago · 0 comments
#231 · Update x_transformers.py · notprime · closed 5 months ago · 9 comments
#230 · How to build optimizer · pfeatherstone · closed 5 months ago · 9 comments
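On #230: the library itself does not appear to ship an optimizer builder, so a plain PyTorch optimizer over model.parameters() is the usual route. A minimal sketch; the learning rate, weight decay, and the toy loss target are illustrative only:

    import torch
    import torch.nn.functional as F
    from x_transformers import TransformerWrapper, Decoder

    model = TransformerWrapper(
        num_tokens = 20000,
        max_seq_len = 1024,
        attn_layers = Decoder(dim = 512, depth = 6, heads = 8)
    )

    # any standard PyTorch optimizer works; hyperparameters are placeholders
    optimizer = torch.optim.AdamW(model.parameters(), lr = 3e-4, weight_decay = 1e-2)

    x = torch.randint(0, 20000, (2, 1024))
    logits = model(x)
    loss = F.cross_entropy(
        logits.transpose(1, 2),   # (batch, vocab, seq) as cross_entropy expects
        x                         # toy target: predict the input itself
    )
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()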
#229 · Question: How to implement rel_pos_bias in cross_attention? · alexdemartos · closed 2 months ago · 13 comments
#228 · attn_num_mem_kv > 0 and attn_one_kv_head = True error · pfeatherstone · closed 5 months ago · 8 comments
#227 · Adding memmask to ContinuousTransformerWrapper · pfeatherstone · closed 5 months ago · 3 comments
#226 · Seq len missing in rotary embedding · raganato · closed 6 months ago · 3 comments
#225 · Removing biases breaks pre-trained models · zqevans · closed 5 months ago · 5 comments
#224 · Fix rotary embeddings when mems != None · pfeatherstone · closed 6 months ago · 11 comments
#223 · XL-recurrence with RotaryEmbedding and mems not working correctly · pfeatherstone · closed 6 months ago · 34 comments
#222 · Enhancement: Multi Input/Output transformers · RyanKim17920 · opened 6 months ago · 1 comment
#221 · Question: How to load a model trained on an earlier version of x-transformers · tmphex · closed 6 months ago · 3 comments
#220 · Init bias=0 in to_logits · ad8e · closed 6 months ago · 13 comments
#219 · kv cache breaks generation · ad8e · closed 6 months ago · 5 comments
#218 · How to set inputs to the right shape · emadkavousi · opened 6 months ago · 1 comment
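On #218: a sketch of the expected input shapes. TransformerWrapper takes integer token ids of shape (batch, seq_len), while ContinuousTransformerWrapper takes float features of shape (batch, seq_len, dim_in). The dim_in / dim_out keyword names are assumed from the README and may differ across versions; all sizes are placeholders.

    import torch
    from x_transformers import (
        TransformerWrapper,
        ContinuousTransformerWrapper,
        Decoder,
        Encoder,
    )

    # token-based: input is (batch, seq_len) integer ids
    token_model = TransformerWrapper(
        num_tokens = 20000,
        max_seq_len = 1024,
        attn_layers = Decoder(dim = 512, depth = 6, heads = 8)
    )
    ids = torch.randint(0, 20000, (2, 1024))
    logits = token_model(ids)          # (2, 1024, 20000)

    # continuous: input is (batch, seq_len, dim_in) float features
    cont_model = ContinuousTransformerWrapper(
        dim_in = 32,
        dim_out = 32,
        max_seq_len = 1024,
        attn_layers = Encoder(dim = 512, depth = 6, heads = 8)
    )
    feats = torch.randn(2, 1024, 32)
    out = cont_model(feats)            # (2, 1024, 32)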
"Stabilizing Transformer Training by Preventing Attention Entropy Collapse" improvement to ViT
#217
catid
closed
6 months ago
1
Question: num_memory_tokens > 0 and return_mems = True
#216
pfeatherstone
closed
6 months ago
3
Support for NormSoftmax
#215
catid
closed
7 months ago
16
Simplifying Transformer Blocks (https://arxiv.org/abs/2311.01906)
#214
Froskekongen
closed
7 months ago
9
Bert token type embedding
#213
eyalmazuz
closed
7 months ago
2
ONNX export failed
#212
pfeatherstone
opened
7 months ago
14
Masking for prepend_embeds
#211
zqevans
closed
7 months ago
7
rotary embedding issues when training in mixed precision
#210
zqevans
closed
7 months ago
2
[Bug] ContinuousTransformerWrapper - return_mems doens't work
#209
pfeatherstone
closed
7 months ago
1
Question: masking in token shifting
#208
pfeatherstone
opened
7 months ago
1