NaN ppl when running on Llama2-7b without owq (--wbits 4 --target_bit 4).

Hi Author,

I want to measure the perplexity (ppl) of plain OPTQ, without OWQ, using your code, but the result is NaN. Could you tell me what might be causing this and how to fix it? The command I ran is: "python main.py meta-llama/Llama-2-7b-hf c4 --wbits 4 --target_bit 4"

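The console log is below. In the part shown, the error first becomes NaN at model.layers.30.mlp.down_proj and stays NaN for every module in layer 31: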

error 58015.34375
Quantizing model.layers.29.self_attn.o_proj
time 2.63
error 3950.656494140625
Quantizing model.layers.29.mlp.gate_proj
time 4.79
error 111899.75
Quantizing model.layers.29.mlp.up_proj
time 4.79
error 101437.4375
Quantizing model.layers.29.mlp.down_proj
time 7.13
error 33238.73046875
Quantizing model.layers.30.self_attn.q_proj
time 2.66
error 116491.953125
Quantizing model.layers.30.self_attn.k_proj
time 2.65
error 87742.4375
Quantizing model.layers.30.self_attn.v_proj
time 2.63
error 64090.1484375
Quantizing model.layers.30.self_attn.o_proj
time 2.65
error 4883.51513671875
Quantizing model.layers.30.mlp.gate_proj
time 4.79
error 115987.59375
Quantizing model.layers.30.mlp.up_proj
time 4.81
error 102952.34375
Quantizing model.layers.30.mlp.down_proj
time 6.98
error nan
Quantizing model.layers.31.self_attn.q_proj
time 2.60
error nan
Quantizing model.layers.31.self_attn.k_proj
time 2.65
error nan
Quantizing model.layers.31.self_attn.v_proj
time 2.65
error nan
Quantizing model.layers.31.self_attn.o_proj
time 2.60
error nan
Quantizing model.layers.31.mlp.gate_proj
time 4.80
error nan
Quantizing model.layers.31.mlp.up_proj
time 4.80
error nan
Quantizing model.layers.31.mlp.down_proj
time 7.07
error nan
Running Time : 1751.8
wikitext2
Evaluating ...
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 32/32 [01:13<00:00,  2.30s/it]
nan
ptb
Evaluating ...
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 32/32 [00:27<00:00,  1.17it/s]
nan
c4
Evaluating ...
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 32/32 [02:01<00:00,  3.80s/it]
nan
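
In case it helps with reproducing this, here is a minimal sketch of how the log can be scanned for the first module whose reported error is not finite. It assumes the log was saved as text in the same "Quantizing <name>" / "error <value>" format shown above. `first_nan_layer` is a hypothetical helper I wrote for this report, not part of the owq code, and the sample text is just a few lines copied from the log.

```python
import math
import re


def first_nan_layer(log_text):
    """Return the first module whose reported error is NaN or inf, or None."""
    current = None  # module named by the most recent "Quantizing ..." line
    for raw in log_text.splitlines():
        line = raw.strip()
        m = re.match(r"Quantizing (\S+)", line)
        if m:
            current = m.group(1)
            continue
        m = re.match(r"error (\S+)", line)
        if m and current is not None:
            try:
                value = float(m.group(1))  # float("nan") parses without error
            except ValueError:
                continue  # skip anything that is not a number
            if not math.isfinite(value):
                return current
    return None


# A few lines copied from the log above. The leading "error" line has no
# preceding "Quantizing" line (the log is truncated), so it is ignored.
sample = """error 58015.34375
Quantizing model.layers.30.mlp.up_proj
time 4.81
error 102952.34375
Quantizing model.layers.30.mlp.down_proj
time 6.98
error nan
Quantizing model.layers.31.self_attn.q_proj
time 2.60
error nan
"""

print(first_nan_layer(sample))  # prints model.layers.30.mlp.down_proj
```

This only locates where the NaN first shows up in the log. It does not explain why it happens.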

Thanks

xvyaward / owq
