facebookresearch / LLM-QAT
Code repo for the paper "LLM-QAT: Data-Free Quantization Aware Training for Large Language Models"
License: Other · 249 stars · 24 forks
Issues
| # | Title | Author | Status | Comments |
|---|---|---|---|---|
| #32 | Does LLM-QAT support group-wise quantization? | mxjmtxrm | opened 1 month ago | 0 |
| #31 | Runtime error at `elif self.deepspeed:` | LSB0798 | opened 2 months ago | 0 |
| #30 | How to run inference | wangkuiyi | opened 3 months ago | 0 |
| #29 | Looks like an incorrect README file | gdsaikrishna | closed 3 months ago | 1 |
| #28 | Question about the training cost | KimythAnly | closed 3 months ago | 1 |
| #27 | Does this method support chat models as well as Llama-2 models? | Saoyu99 | opened 6 months ago | 1 |
| #26 | How long will it take to train | XA23i | opened 10 months ago | 1 |
| #25 | Accuracy | yileijin | opened 11 months ago | 0 |
| #24 | Suggest changing the README | jingyao-zhang | closed 12 months ago | 1 |
| #23 | FileNotFoundError: [Errno 2] No such file or directory: 'wiki2.jsonl' | StiphyJay | opened 1 year ago | 1 |
| #22 | Training is not working | XinnuoXu | opened 1 year ago | 0 |
| #21 | Hi, could you open-source your generated training data? Thanks! | Xingrun-Xing | opened 1 year ago | 1 |
| #20 | docs: blockquote cite article format README | guspan-tanadi | closed 9 months ago | 0 |
| #19 | Inconsistent results with LLM.int8() and SmoothQuant papers | fxmarty | closed 1 year ago | 1 |
| #18 | Questions about the valid_dataset format | TravisL24 | closed 1 year ago | 1 |
| #17 | Questions about the valid_dataset format | TravisL24 | closed 1 year ago | 0 |
| #16 | run run_train.sh, CUDA out of memory | priscilla-pan | closed 1 year ago | 0 |
| #15 | no smoothquant in QuantizeLinear | priscilla-pan | closed 1 year ago | 5 |
| #14 | Is there an efficient way to generate data? | benyang0506 | closed 1 year ago | 3 |
| #13 | The choice of kd_loss_scale | zhanlaoban | closed 1 year ago | 1 |
| #12 | Can you provide an inference example for QuantizeLinear in 8-8-8? | jackzhou121 | closed 1 year ago | 2 |
| #11 | How should I save the 8-bit model? | liguodongiot | closed 1 year ago | 2 |
| #10 | If 4-8-8 is used to do QAT, how to process weights in inference? | jackzhou121 | closed 1 year ago | 1 |
| #9 | No randomization operation for the first token in the data generation phase | xingyueye | closed 1 year ago | 1 |
| #8 | Why use clip_tensor[-2.0, 2.0] in the backward? | jackzhou121 | closed 1 year ago | 1 |
| #7 | APEX and FSDP cannot run | jackzhou121 | closed 1 year ago | 1 |
| #6 | Why the generated data was not used | jackzhou121 | closed 1 year ago | 1 |
| #5 | Does this method support Bloom? | 18140663659 | closed 1 year ago | 1 |
| #4 | Expects full precision but got torch.bfloat16 error | liguodongiot | opened 1 year ago | 1 |
| #3 | Hardcoded train paths and configuration for table in README | aitorormazabal | closed 1 year ago | 1 |
| #2 | Why do smoothquant dynamically in the forward() function of the QuantizeLinear layer | Starmys | closed 1 year ago | 2 |
| #1 | Change teaser image to relative path to properly display on GitHub | Lyken17 | closed 1 year ago | 0 |