Closed DHKim0428 closed 2 months ago
Hi, you can tackle this by setting the `channel_wise` argument to `True` in `aq_params = {'n_bits': opt.act_bit, 'symmetric': opt.a_sym, 'channel_wise': True, 'scale_method': a_scale_method, 'leaf_param': opt.quant_act}`. In practice, this is actually token-wise quantization, which does not affect efficiency. Line 474 in the code deals with this issue.
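To illustrate why token-wise scales matter at 4 bits, here is a minimal pure-Python sketch. It is not the QuEST implementation: the function names (`symmetric_scale`, `fake_quantize`, `quantize_per_tensor`, `quantize_per_token`) and the toy activations are made up for illustration, and it assumes plain symmetric min-max scaling. With one shared scale, a token with small activations can collapse to zero when another token has a large range. With one scale per token, each token keeps its own resolution.

```python
def symmetric_scale(values, n_bits):
    """Min-max scale for symmetric signed quantization (hypothetical helper)."""
    q_max = 2 ** (n_bits - 1) - 1
    max_abs = max(abs(v) for v in values)
    return max_abs / q_max if max_abs > 0 else 1.0


def fake_quantize(values, scale, n_bits):
    """Quantize to the signed integer grid, clamp, then dequantize."""
    q_max = 2 ** (n_bits - 1) - 1
    q_min = -q_max - 1
    return [max(q_min, min(q_max, round(v / scale))) * scale for v in values]


def quantize_per_tensor(tokens, n_bits):
    """One scale shared by every token (row)."""
    scale = symmetric_scale([v for row in tokens for v in row], n_bits)
    return [fake_quantize(row, scale, n_bits) for row in tokens]


def quantize_per_token(tokens, n_bits):
    """A separate scale for each token (row)."""
    return [fake_quantize(row, symmetric_scale(row, n_bits), n_bits) for row in tokens]


def max_abs_error(original, quantized):
    """Largest absolute difference between two equally sized lists."""
    return max(abs(a - b) for a, b in zip(original, quantized))


# Toy activations: token 0 has a small range, token 1 has a large range.
tokens = [
    [0.02, -0.05, 0.03, 0.01],
    [8.0, -6.5, 7.2, -3.1],
]

per_tensor = quantize_per_tensor(tokens, n_bits=4)
per_token = quantize_per_token(tokens, n_bits=4)

for i, row in enumerate(tokens):
    print(
        f"token {i}: per-tensor error = {max_abs_error(row, per_tensor[i]):.4f}, "
        f"per-token error = {max_abs_error(row, per_token[i]):.4f}"
    )
```

In this toy case the small-range token is rounded entirely to zero under the shared scale, while the large-range token is quantized identically either way. The same effect is much milder at 8 bits because the grid is 16 times finer.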
However, the experiment I ran above already used the 'channel_wise'=True setting... (as mentioned in line 474)
Try excluding the --running_stat argument. I think this is probably the issue.
Thanks a lot, I will try it.
It works!
Hi,
I executed the W4A4 LSUN-Bedroom calibration using the following command:
python sample_diffusion_ldm_bedroom.py -r models/ldm/lsun_beds256/model.ckpt -n 100 --batch_size 20 -c 200 -e 1.0 --seed 40 --ptq --weight_bit 4 --quant_mode qdiff --cali_st 20 --cali_batch_size 32 --cali_n 256 --quant_act --act_bit 4 --a_sym --a_min_max --running_stat --cali_data_path <cali_data_path> -l <output_path>
I observed that it works well under the W4A8 setting but produces unexpected results under the W4A4 setting. Could you please advise whether any specific hyperparameters or configurations in the default code need adjusting to address this problem?
Here are my results.
W4A4 Setting.
W4A8 Setting.
![sample_000002](https://github.com/hatchetProject/QuEST/assets/31202714/047d0425-f486-4b1a-962d-4ee11b99d31a)