turboderp/exllama
A more memory-efficient rewrite of the HF transformers implementation of Llama for use with quantized weights.
MIT License · 2.66k stars · 214 forks
Issues
#258 · Tried to build exllama but encountering ninja-related errors; can someone please help me? · BwandoWando · opened 10 months ago · 3 comments
#257 · Stop-string support? · krypterro · opened 10 months ago · 2 comments
#256 · Request: some improvements to web app.py · Midaychi · opened 10 months ago · 0 comments
#255 · Refine JSON dicts for WS example · Kerushii · closed 10 months ago · 0 comments
#254 · Bad output for 2080 Ti · filipemesquita · opened 10 months ago · 1 comment
#253 · GPU usage stays high even without inference load · leonxia1018 · opened 10 months ago · 7 comments
#252 · Is it possible to do batch generation? · fahadh4ilyas · opened 10 months ago · 7 comments
#251 · Are we *really* using NVLink? · Ph0rk0z · closed 10 months ago · 1 comment
#250 · Recover unsaved modifications · Kerushii · closed 10 months ago · 3 comments
#249 · WS example for streaming with context reuse and token testing · Kerushii · closed 10 months ago · 0 comments
#248 · Custom multiple stop tokens (for roleplay / conversation) · wangerzi · closed 10 months ago · 6 comments
#245 · Possible to load a model with low system RAM? · gros87 · opened 10 months ago · 4 comments
#244 · RuntimeError: temp_state buffer is too small · daniel-kukiela · closed 10 months ago · 1 comment
#243 · Modify generator.py > generate_simple to accept encode_special_characters? · zmarty · opened 11 months ago · 1 comment
#242 · "Header too large" error when running benchmark · DKormann · closed 10 months ago · 2 comments
#241 · Is there a way to make compress_pos_emb dynamic? · fahadh4ilyas · closed 10 months ago · 2 comments
#240 · Can max_seq_len be set via CLI or GUI in the webui? · int19h · closed 10 months ago · 2 comments
#238 · KV caching? · bryanhpchiang · opened 11 months ago · 2 comments
#237 · Continuous batching support · FireMasterK · opened 11 months ago · 0 comments
#236 · Generation uses config.max_seq_len instead of the default 2048 · flotos · closed 11 months ago · 1 comment
#235 · Question about example_flask.py · ZeroYuJie · opened 11 months ago · 1 comment
#234 · Question about sampling and kernel fusion · sleepwalker2017 · closed 11 months ago · 6 comments
#233 · RuntimeError with airoboros-l2-13b · corv89 · closed 11 months ago · 2 comments
#232 · Strange output / doesn't make any sense · lordwebbie · closed 11 months ago · 5 comments
#231 · Slower tokens/s than expected · teknium1 · opened 11 months ago · 14 comments
#230 · Support for NF4? · hoagy-davis-digges · opened 11 months ago · 1 comment
#226 · [Bug]: Sampling fails when temperature is 0 · kogolobo · opened 11 months ago · 4 comments
#225 · Hangs after reboot caused by triple fault · SolsticeProjekt · closed 11 months ago · 3 comments
#224 · Fix HIP on recent PyTorch versions · ardfork · closed 11 months ago · 0 comments
#223 · Custom stop tokens in generator.py · Kerushii · closed 10 months ago · 1 comment
#222 · Please handle the case where logits contain NaNs · ParisNeo · opened 11 months ago · 1 comment
#221 · Llama 2 Chat implementation · SinanAkkoyun · opened 11 months ago · 10 comments
#220 · Weird issue with context length · zzzacwork · opened 11 months ago · 6 comments
#219 · Which Llama model do you use? Could you give a download link? · sleepwalker2017 · closed 11 months ago · 3 comments
#218 · Speculative decoding? · bryanhpchiang · opened 11 months ago · 17 comments
#217 · Very bad response · pourfard · closed 10 months ago · 9 comments
#216 · Reply is too short · hengjiUSTC · closed 11 months ago · 4 comments
#215 · How to extend context with Llama 2? · ShahZ181 · closed 11 months ago · 3 comments
#214 · Question about storing models in a container · JacobGoldenArt · opened 11 months ago · 2 comments
#212 · [Feature Request] OpenAI-compatible API · langchain4j · closed 11 months ago · 11 comments
#211 · "temp_state buffer is too small" when using Llama 13B at full context length · anujnayyar1 · closed 11 months ago · 5 comments
#210 · Add example of max seq length configuration · vadi2 · closed 11 months ago · 2 comments
#209 · Compile kernel · xiaoxiangshusheng · closed 11 months ago · 1 comment
#208 · Unable to split across multiple AMD GPUs · TNT3530 · closed 11 months ago · 4 comments
#207 · Infinities during model evaluation · 50h100a · closed 11 months ago · 8 comments
#206 · How to shard model and batched cache equally? · nivibilla · closed 11 months ago · 4 comments
#205 · Can't assign model to multiple GPUs · nivibilla · closed 11 months ago · 1 comment
#202 · Latency grows substantially as batch size increases, even with small batch sizes · joehoover · opened 11 months ago · 2 comments
#201 · Fixed seed doesn't work in ooba's webui · BadisG · closed 11 months ago · 3 comments
#200 · . · mrbianchi · closed 11 months ago · 0 comments