Vahe1994 / SpQR
Apache License 2.0 · 513 stars · 40 forks
Issues (sorted newest first)
#44 · Any update on inference code? · nate-sanders · opened 6 months ago · 0 comments
#43 · LLaMa 30B loading error · DavidePaglieri · closed 7 months ago · 1 comment
#42 · has the inference code released? · singingtower · opened 7 months ago · 0 comments
#41 · Fixed bug in lmeval.py caused by --save argument · Godofnothing · closed 7 months ago · 1 comment
#40 · Yi support and cleanup · poedator · closed 7 months ago · 0 comments
#39 · Which dataset should I use? · ccccj · opened 8 months ago · 1 comment
#38 · Added OPT family model support · Godofnothing · closed 9 months ago · 0 comments
#37 · Auto style check · Vahe1994 · closed 10 months ago · 1 comment
#36 · Revert "Added automatic style check" · Vahe1994 · closed 10 months ago · 0 comments
#35 · Added automatic style check · Vahe1994 · closed 10 months ago · 3 comments
#34 · Refactored with black, line_length=120 · Godofnothing · closed 10 months ago · 0 comments
#33 · Doesn't seem to work for Baichuan-7B · CPegasus · closed 11 months ago · 0 comments
#32 · Inference part 1/n (statistic saving) · Vahe1994 · closed 10 months ago · 2 comments
#31 · Does permutation order have to be included when saving the quantized model? · luccareinehr · closed 11 months ago · 2 comments
#30 · Outlier mask is still permuted when returned · luccareinehr · closed 11 months ago · 1 comment
#29 · Datautils upd 2 · poedator · closed 11 months ago · 0 comments
#28 · Drop --no_cache arg in lmeval.py · poedator · closed 11 months ago · 0 comments
#27 · improved offload_activations · poedator · closed 11 months ago · 1 comment
#26 · undo low_cpu_mem_usage=True in huggingface.py · poedator · closed 11 months ago · 0 comments
#25 · fixed RefinedWeb vs RefinedWebModel in modelutils.py and main.py · poedator · closed 11 months ago · 0 comments
#24 · How evaluation being done without storing quantized weights? · tahmiddialpad · closed 10 months ago · 1 comment
#23 · Rollback model loading to match the code from the paper · justheuristic · closed 1 year ago · 3 comments
#22 · CUDA out of memory falcon-40b when using 40Gi A100 GPU · caleb-artifact · opened 1 year ago · 1 comment
#21 · Process killed after eval phase · Iambestfeed · opened 1 year ago · 4 comments
#20 · refactor of datautils · poedator · closed 11 months ago · 1 comment
#19 · Post Quantization for nllb-models · Arnab1181412 · opened 1 year ago · 1 comment
#18 · Reason for permutation and weights after it is inverted · NRodion · closed 1 year ago · 4 comments
#17 · Evaluation code for Falcon models · Amshaker · opened 1 year ago · 1 comment
#16 · [ptb perplexity is different from paper] · Amshaker · opened 1 year ago · 8 comments
#15 · Provide SpQR trained model weights on OpenLLaMA? · razor08 · opened 1 year ago · 1 comment
#14 · SqueezeLLM · Iambestfeed · opened 1 year ago · 2 comments
#13 · Fix readme typo · erjanmx · closed 1 year ago · 0 comments
#12 · How to test inference speed? · JianbangZ · opened 1 year ago · 4 comments
#11 · Why no save? · yhyu13 · closed 1 year ago · 2 comments
#10 · fix the missing of import get_loaders · geekinglcq · closed 1 year ago · 1 comment
#9 · model downloading · AmeenAli · closed 1 year ago · 1 comment
#8 · Refactor · Godofnothing · closed 1 year ago · 0 comments
#7 · Deduplication · poedator · closed 1 year ago · 0 comments
#6 · Will the function of model saving be realized in the future? · ShaunHeNJU · opened 1 year ago · 2 comments
#5 · Fix memory inefficiency for baseline (GPTQ) configuration · justheuristic · closed 1 year ago · 1 comment
#4 · Refactor parameters · Vahe1994 · closed 1 year ago · 1 comment
#3 · Update README.md · eltociear · closed 1 year ago · 0 comments
#2 · Readme: Update paper title and add hyperlink · EwoutH · closed 1 year ago · 0 comments
#1 · Can I save the compressed model for direct inference only? · SparkJiao · closed 1 year ago · 3 comments