Vahe1994 / SpQR
Apache License 2.0 · 513 stars · 40 forks
Issues (sorted newest first)
#44 · Any update on inference code? · nate-sanders · opened 6 months ago · 0 comments
#43 · LLaMa 30B loading error · DavidePaglieri · closed 7 months ago · 1 comment
#42 · has the inference code released? · singingtower · opened 7 months ago · 0 comments
#41 · Fixed bug in lmeval.py caused by --save argument · Godofnothing · closed 7 months ago · 1 comment
#40 · Yi support and cleanup · poedator · closed 7 months ago · 0 comments
#39 · Which dataset should I use? · ccccj · opened 8 months ago · 1 comment
#38 · Added OPT family model support · Godofnothing · closed 9 months ago · 0 comments
#37 · Auto style check · Vahe1994 · closed 10 months ago · 1 comment
#36 · Revert "Added automatic style check" · Vahe1994 · closed 10 months ago · 0 comments
#35 · Added automatic style check · Vahe1994 · closed 10 months ago · 3 comments
#34 · Refactored with black, line_length=120 · Godofnothing · closed 10 months ago · 0 comments
#33 · Doesn't seem to work for Baichuan-7B · CPegasus · closed 11 months ago · 0 comments
#32 · Inference part 1/n (statistic saving) · Vahe1994 · closed 10 months ago · 2 comments
#31 · Does permutation order have to be included when saving the quantized model? · luccareinehr · closed 11 months ago · 2 comments
#30 · Outlier mask is still permuted when returned · luccareinehr · closed 11 months ago · 1 comment
#29 · Datautils upd 2 · poedator · closed 11 months ago · 0 comments
#28 · Drop --no_cache arg in lmeval.py · poedator · closed 11 months ago · 0 comments
#27 · improved offload_activations · poedator · closed 11 months ago · 1 comment
#26 · undo low_cpu_mem_usage=True in huggingface.py · poedator · closed 11 months ago · 0 comments
#25 · fixed RefinedWeb vs RefinedWebModel in modelutils.py and main.py · poedator · closed 11 months ago · 0 comments
#24 · How evaluation being done without storing quantized weights? · tahmiddialpad · closed 10 months ago · 1 comment
#23 · Rollback model loading to match the code from the paper · justheuristic · closed 1 year ago · 3 comments
#22 · CUDA out of memory falcon-40b when using 40Gi A100 GPU · caleb-artifact · opened 1 year ago · 1 comment
#21 · Process killed after eval phase · Iambestfeed · opened 1 year ago · 4 comments
#20 · refactor of datautils · poedator · closed 11 months ago · 1 comment
#19 · Post Quantization for nllb-models · Arnab1181412 · opened 1 year ago · 1 comment
#18 · Reason for permutation and weights after it is inverted · NRodion · closed 1 year ago · 4 comments
#17 · Evaluation code for Falcon models · Amshaker · opened 1 year ago · 1 comment
#16 · [ptb perplexity is different from paper] · Amshaker · opened 1 year ago · 8 comments
#15 · Provide SpQR trained model weights on OpenLLaMA? · razor08 · opened 1 year ago · 1 comment
#14 · SqueezeLLM · Iambestfeed · opened 1 year ago · 2 comments
#13 · Fix readme typo · erjanmx · closed 1 year ago · 0 comments
#12 · How to test inference speed? · JianbangZ · opened 1 year ago · 4 comments
#11 · Why no save? · yhyu13 · closed 1 year ago · 2 comments
#10 · fix the missing of import get_loaders · geekinglcq · closed 1 year ago · 1 comment
#9 · model downloading · AmeenAli · closed 1 year ago · 1 comment
#8 · Refactor · Godofnothing · closed 1 year ago · 0 comments
#7 · Deduplication · poedator · closed 1 year ago · 0 comments
#6 · Will the function of model saving be realized in the future? · ShaunHeNJU · opened 1 year ago · 2 comments
#5 · Fix memory inefficiency for baseline (GPTQ) configuration · justheuristic · closed 1 year ago · 1 comment
#4 · Refactor parameters · Vahe1994 · closed 1 year ago · 1 comment
#3 · Update README.md · eltociear · closed 1 year ago · 0 comments
#2 · Readme: Update paper title and add hyperlink · EwoutH · closed 1 year ago · 0 comments
#1 · Can I save the compressed model for direct inference only? · SparkJiao · closed 1 year ago · 3 comments