qwopqwop200 / GPTQ-for-LLaMa
4 bits quantization of LLaMA using GPTQ
Apache License 2.0 · 2.99k stars · 459 forks
Issues
#240 · Porting GPTQ to CPU? · yiliu30 · opened 1 year ago · 2 comments
#239 · AttributeError: module 'torch.nn.functional' has no attribute 'scaled_dot_product_attention' · leszekhanusz · closed 1 year ago · 2 comments
#238 · adding missing transformers import to opt.py (old-cuda branch) · YellowRoseCx · closed 1 year ago · 0 comments
#237 · adding missing transformers import to opt.py (cuda branch) · YellowRoseCx · closed 1 year ago · 0 comments
#236 · 6-bit quantization · philipturner · opened 1 year ago · 1 comment
#235 · Add -O3 flag to nvcc · Noir-Lime · closed 1 year ago · 1 comment
#234 · Giepeto · IsaacGanon · closed 1 year ago · 0 comments
#233 · fastest-inference-4bit fails to build · lee-b · closed 1 year ago · 3 comments
#232 · no module named quant_cuda (fastest-inference-4bit branch) · joshlevy89 · opened 1 year ago · 1 comment
#231 · Benchmark broken on H100 · FrederikAbitz · opened 1 year ago · 0 comments
#230 · question about the zero_point · irasin · opened 1 year ago · 0 comments
#229 · running on old gpu with fp32 only · DeoLeung · opened 1 year ago · 3 comments
#228 · How to run inference with llama-65b-4bit on multiple GPUs · Minami-su · closed 1 year ago · 6 comments
#227 · Result with the branch `fastest-inference-4bit` · alanxmay · closed 1 year ago · 11 comments
#226 · where to get /path/to/downloaded/llama/weights · SeekPoint · opened 1 year ago · 0 comments
#225 · About fine-grained weight quantization · xingyueye · opened 1 year ago · 0 comments
#224 · OpenCL support · apcameron · opened 1 year ago · 1 comment
#223 · Bump protobuf from 3.20.0 to 3.20.2 · dependabot[bot] · closed 1 year ago · 0 comments
#222 · Update to the protobuf version used in the tokenizer · openloop · closed 1 year ago · 0 comments
#221 · Better, faster, smaller rotary embedding implementation in Triton · aljungberg · closed 1 year ago · 1 comment
#220 · Errors compiling with CUDA 12.1 · fcolecumberri · closed 1 year ago · 2 comments
#219 · Error on A100: device kernel image is invalid · lileilai · opened 1 year ago · 0 comments
#218 · Multi-GPU: allocate output tensor on input tensor's device · Lunderberg · closed 1 year ago · 0 comments
#217 · Assertion `!(srcMmaLayout && dstMmaLayout) && "Unexpected mma -> mma layout conversion"' failed · chigkim · opened 1 year ago · 2 comments
#216 · CUDA kernel sync problem · chu-tianxiang · closed 1 year ago · 1 comment
#215 · wbit=16 conversion gives error · sawradip · opened 1 year ago · 2 comments
#214 · CUDA benchmark on 2bit, 3bit, 4bit models: why is 3bit slower than 4bit but faster than 2bit? · sawradip · closed 1 year ago · 1 comment
#213 · 4bits on 65B · jear · closed 1 year ago · 1 comment
#212 · explicitly declare wbits and group_size · cauyxy · closed 1 year ago · 0 comments
#211 · How can I get the gradient when using a 4-bit model? · Joanna-0421 · opened 1 year ago · 0 comments
#210 · IndexError: tensors used as indices must be long, byte or bool tensors · Pathos14489 · opened 1 year ago · 2 comments
#209 · CUDA error: unknown error (when quantizing a LLaMA model) · ostix360 · opened 1 year ago · 1 comment
#208 · Add --layers-dist to define layer distribution across multiple GPUs · Thireus · closed 1 year ago · 0 comments
#207 · neox.py generates randrange() error · GenTxt · closed 1 year ago · 13 comments
#206 · Security Issue: This Auto-downloads 800 trojan viruses · freckletonj · closed 1 year ago · 2 comments
#205 · CUDA: 8bit quantized models are stupid · Ph0rk0z · opened 1 year ago · 4 comments
#204 · File "<string>", line 21, in matmul_248_kernel · moophlo · opened 1 year ago · 0 comments
#203 · Fix NameError: name 'math' is not defined · Thireus · closed 1 year ago · 0 comments
#202 · sharing gpu tensors across processes +devicemap · xloem · closed 1 year ago · 0 comments
#201 · cuda/quant.py: respect device index · xloem · closed 1 year ago · 0 comments
#200 · fix bug · qwopqwop200 · closed 1 year ago · 0 comments
#199 · Fix: TabError · USBhost · closed 1 year ago · 0 comments
#198 · NameError: name 'transformers' is not defined · catalpaaa · closed 1 year ago · 2 comments
#197 · style(project): format with yapf · tpoisonooo · closed 1 year ago · 5 comments
#196 · feat(llama.py): add SNR error · tpoisonooo · closed 1 year ago · 1 comment
#195 · WIP: feat(llama.py): quantize input · tpoisonooo · closed 1 year ago · 2 comments
#194 · Fix NameError: name 'transformers' is not defined · Thireus · closed 1 year ago · 0 comments
#193 · llama 30b generates strange answers after quantizing to 4bit · pzzmyc · closed 1 year ago · 1 comment
#192 · why disable tf32? · tpoisonooo · closed 1 year ago · 4 comments
#191 · slower inference speed · MatthewCYM · closed 1 year ago · 4 comments