sdan/selfextend
An implementation of Self-Extend, which expands the context window via grouped attention.
https://arxiv.org/pdf/2401.01325.pdf
Apache License 2.0
114 stars
2 forks
issues
Attention implementation through torch.nn.functional.scaled_dot_product_attention not supported
#5 opened 5 months ago by eightBEC, 0 comments

Long input series makes oom
#4 opened 5 months ago by seanxuu, 0 comments

Plans to support Solar 10.7b as well?
#3 opened 6 months ago by olsn, 1 comment

TypeError: SelfExtendMistralAttention.apply_pos_emcode() takes 4 positional arguments but 6 were given
#2 opened 6 months ago by PolyGPT, 1 comment

License
#1 opened 6 months ago by fakerybakery, 0 comments