LiMa-cas opened this issue 4 months ago
Thank you for your interest in our work. MixDQ is a PTQ method that does not require tuning; the code in base_quantizer.py is simply there for compatibility.
Thanks a lot!
path: "/share/public/diffusion_quant/calib_dataset/bs32_t30_sdxl.pt" HI, where can I download this file?i need all the file download
You can generate this file yourself by following the instructions in README.md, step 1.1 "Generate Calibration Data":
CUDA_VISIBLE_DEVICES=$1 python scripts/gen_calib_data.py --config ./configs/stable-diffusion/$config_name --save_image_path ./debug_imgs
Thanks a lot. Another question: at inference time, is it much slower because an if/else is needed to decide which precision to dequantize?
I'm not quite sure I fully understand your question, but yes: the code in this repository is the "algorithm-level" quantization simulation code, and it runs slower than FP16. For actual speedup, customized CUDA kernels that utilize INT computation are needed (see our Hugging Face demo code: https://huggingface.co/nics-efc/MixDQ).
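For intuition, here is a minimal sketch (not the repository's actual code) of what "simulated" quantization means: the tensor is quantized and immediately dequantized back to floating point, so every layer still runs FP16/FP32 matmuls plus extra rounding overhead, which is why the simulation is slower than plain FP16 rather than faster.

```python
import torch

def fake_quantize(x: torch.Tensor, n_bits: int = 8) -> torch.Tensor:
    """Simulated (fake) asymmetric uniform quantization.

    The tensor is quantized to integers and immediately dequantized back
    to float, so downstream matmuls still run in floating point. This is
    only for accuracy simulation; real speedup needs INT GEMM kernels.
    """
    qmin, qmax = 0, 2 ** n_bits - 1
    x_min, x_max = x.min(), x.max()
    scale = (x_max - x_min).clamp(min=1e-8) / (qmax - qmin)
    zero_point = torch.round(-x_min / scale)
    x_int = torch.clamp(torch.round(x / scale) + zero_point, qmin, qmax)
    return (x_int - zero_point) * scale  # back to float: no INT compute used

# The extra round/clamp work on top of the FP matmul is the overhead;
# a fused INT8 CUDA kernel avoids it and uses integer arithmetic instead.
x = torch.randn(4, 4)
print(fake_quantize(x, n_bits=8))
```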
In base_quantizer.py there is this docstring: "PyTorch Function that can be used for asymmetric quantization (also called uniform affine quantization). Quantizes its argument in the forward pass, passes the gradient 'straight through' on the backward pass, ignoring the quantization that occurred. Based on https://arxiv.org/abs/1806.08342." So is MixDQ a PTQ or a QAT method? Is a backward pass needed during quantization?
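For reference, that docstring describes a standard straight-through estimator (STE). A minimal, hypothetical sketch of the pattern is below (this is not MixDQ's actual implementation). As noted earlier in the thread, MixDQ is a tuning-free PTQ method, so the backward path exists mainly for compatibility and is not exercised during quantization.

```python
import torch

class RoundSTE(torch.autograd.Function):
    """Rounding with a straight-through estimator (STE).

    Forward: quantize (round). Backward: pass the gradient through
    unchanged, ignoring the non-differentiable rounding step.
    """

    @staticmethod
    def forward(ctx, x):
        return torch.round(x)

    @staticmethod
    def backward(ctx, grad_output):
        return grad_output  # "straight through": identity gradient

# QAT methods rely on this gradient path to fine-tune weights under
# quantization; a tuning-free PTQ method only needs the forward
# (quantize/dequantize) behavior.
x = torch.randn(3, requires_grad=True)
y = RoundSTE.apply(x).sum()
y.backward()
print(x.grad)  # all ones, as if rounding were the identity
```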