idiap/fast-transformers
PyTorch library for fast transformer implementations
1.65k stars · 179 forks
Issues (sorted newest first)
#132 · TypeError: canonicalize_version() got an unexpected keyword argument 'strip_trailing_zero' · opened 2 months ago by luispintoc · 0 comments
#131 · Full Attention does not sum to 1 · closed 4 months ago by yourj4m · 1 comment
#130 · Speed of linear attention slower than the attention implemented in pytorch · opened 5 months ago by yzeng58 · 0 comments
#129 · [WinError 2] The system cannot find the file specified: build_ext · opened 8 months ago by cliffordkleinsr · 1 comment
#128 · ERROR: Could not build wheels for pytorch-fast-transformers, which is required to install pyproject.toml-based projects · opened 11 months ago by ouusan · 12 comments
#127 · ImportError · opened 1 year ago by PaulaTeeuwen · 1 comment
#126 · Provenance of algorithms · opened 1 year ago by taibai123abc · 0 comments
#125 · `.causal_product_cuda` missing in pip installed version on linux · closed 1 year ago by Jackl-o-o-l · 4 comments
#124 · Error about `causal_product_cpu.cpython-38-darwin.so` on Mac · opened 1 year ago by XiaoqZhang · 2 comments
#123 · Got different result for the same batch · closed 1 year ago by gaoshan2006 · 1 comment
#122 · Fix #106 and #121 · closed 1 year ago by wl2776 · 1 comment
#121 · Windows installation - building wheel error · closed 1 year ago by BenoitDalFerro · 27 comments
#120 · Detailed implementation of `clustered_sparse_dot_product` · opened 1 year ago by HanielF · 0 comments
#119 · Understanding how to define key, query and value for the cross attention calculation · opened 1 year ago by neuronphysics · 0 comments
#118 · Example for NLP · opened 1 year ago by Bachstelze · 0 comments
#117 · Cuda version · opened 1 year ago by jiaji-huang · 1 comment
#116 · Speed of recurrent model · closed 2 years ago by mads-oestergaard · 2 comments
#115 · can offer built code for linus? · closed 2 years ago by li-car-fei · 0 comments
#114 · Can't officially save Linear Attention model · opened 2 years ago by maulberto3 · 2 comments
#113 · Any decoder example? · opened 2 years ago by ahmedraza1996 · 1 comment
#112 · Installing error on linux · closed 2 years ago by xxmlala · 4 comments
#111 · Casual attention is cheating by looking in the future · closed 2 years ago by jogardi · 1 comment
#110 · Runtime error on causal_product_cpu on GCC/G++ 11 · opened 2 years ago by lsisoft · 3 comments
#109 · how causal mask constructed in training batch model with linear causal attention? · opened 2 years ago by Howuhh · 0 comments
#108 · Parallel complexity of Linear Attention is O(N)? · opened 3 years ago by haozheji · 1 comment
#107 · Training Language Model · closed 3 years ago by lucasnfe · 2 comments
#106 · Windows installation - Building wheel · closed 1 year ago by MaximeHoude · 3 comments
#105 · causal-linear do not use attn_mask ? · opened 3 years ago by davidliujiafeng · 1 comment
#104 · CUDA error: CUBLAS_STATUS_INVALID_VALUE · closed 3 years ago by huu4ontocord · 1 comment
#103 · layernorm eps is not copied properly when cloning HF_Bert · opened 3 years ago by huu4ontocord · 2 comments
#102 · For recurrent models, are positional embeddings required? · closed 3 years ago by rongcuid · 5 comments
#101 · Mask and QK not of the same shape ? · closed 3 years ago by Baldwin-disso · 1 comment
#100 · Huggingface Bert vs. Fast Transformer full attention · closed 3 years ago by lipmem · 9 comments
#99 · Quick start raise a ModuleNotFoundError · closed 3 years ago by CaoYiqingT · 2 comments
#98 · local_dot_product_cuda fails when queries and keys have different lengths · opened 3 years ago by tridao · 0 comments
#97 · Installation failed on Windows · closed 2 years ago by WRKULOL · 2 comments
#96 · Can't import causal_product_cuda · closed 3 years ago by 15805383399 · 1 comment
#95 · support of cluster attention · closed 3 years ago by TianhaoFu · 1 comment
#94 · TypeError: forward() missing 3 required positional arguments: 'attn_mask', 'query_lengths', and 'key_lengths' · closed 3 years ago by TianhaoFu · 1 comment
#93 · ModuleNotFoundError: No module named 'aggregate.aggregate_cpu' · closed 3 years ago by TianhaoFu · 2 comments
#92 · installation error · closed 3 years ago by davidliujiafeng · 6 comments
#91 · Question over cuda implementation of causal product (forward) · closed 3 years ago by thomasw21 · 1 comment
#90 · Simplify and 3x Speedup CausalDotProduct CUDA Kernel for Larger Hidden Sizes · closed 3 years ago by qibinc · 1 comment
#89 · pip install and c++ compilation error, then name 'compute_hashes_cuda' is not defined · closed 3 years ago by nikjetchev · 3 comments
#88 · Make fast-transformers JIT Compilable · opened 3 years ago by AndriyMulyar · 1 comment
#87 · [FAVOR & friends] Orthogonal random matrix not uniformly drawn · closed 3 years ago by blefaudeux · 3 comments
#86 · allow to specify activation in transformer · closed 3 years ago by ZhiyuanChen · 2 comments
#85 · Implementing clustering function of clustered_attention with Python. · closed 3 years ago by mHsuann · 2 comments
#84 · Tips and tricks for training linear_att · closed 3 years ago by gaceladri · 7 comments
#83 · CUDA version and CausalDotProduct time · closed 3 years ago by caffeinetoomuch · 4 comments