ModelTC / TFMQ-DM

[CVPR 2024 Highlight] This is the official PyTorch implementation of "TFMQ-DM: Temporal Feature Maintenance Quantization for Diffusion Models".
https://modeltc.github.io/TFMQ-DM/
Apache License 2.0

Where is the code to implement the Finite set calibration? #5

csguoh closed this issue 4 months ago

csguoh commented 4 months ago

Hi, thanks for this work! The paper uses finite set calibration, which applies a timestep-aware scale factor to quantize the activations in embedding_layers and time_embed. I tried running the calibration in txt2img.py and have two questions.

  1. It seems that the disable_out_quantizaion function sets use_aq of time_embed to False, so the activations of the time_embed layer are not actually quantized. This confuses me, since the paper states that time_embed is also activation-quantized.

  2. Where can I find the code that implements the timestep-aware scale factor for activation quantization? I have searched the whole project but still cannot find a timestep-specific scale s for act-quant. Could you help me, please?

Thx.

Harahan commented 4 months ago

Hi,

For the first question, we keep the first linear/conv layers (e.g., time_embed and another conv) in full precision, as described in the experimental settings.

To answer the second question, the algorithm is implemented here: https://github.com/ModelTC/TFMQ-DM/blob/1068e8aa0aa477a59a8afd01a99a906b0f88a8f2/quant/calibration.py#L112 In this version we employ a time-aware scale factor for the whole model, since this type of min-max calibration helps accelerate the quantization process. This has a negligible impact on performance compared with applying it only to our Temporal Information Block and using LSQ for the other components.
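
For readers unfamiliar with the idea, here is a minimal sketch of what a time-aware min-max scale factor could look like. It is not the code from quant/calibration.py: the class name `TimestepMinMaxQuantizer` and its methods are hypothetical, it uses plain Python lists instead of PyTorch tensors, and it assumes simple asymmetric uniform quantization with one (min, max) range tracked per timestep.

```python
from typing import Dict, List, Tuple


class TimestepMinMaxQuantizer:
    """Hypothetical activation quantizer that keeps one min-max range per timestep."""

    def __init__(self, n_bits: int = 8) -> None:
        self.qmax = 2 ** n_bits - 1
        # timestep -> (observed minimum, observed maximum)
        self.ranges: Dict[int, Tuple[float, float]] = {}

    def observe(self, t: int, values: List[float]) -> None:
        """Calibration: widen the stored range for timestep t with new activations."""
        lo, hi = min(values), max(values)
        if t in self.ranges:
            old_lo, old_hi = self.ranges[t]
            lo, hi = min(lo, old_lo), max(hi, old_hi)
        self.ranges[t] = (lo, hi)

    def params(self, t: int) -> Tuple[float, int]:
        """Return the (scale, zero_point) derived from the range of timestep t."""
        lo, hi = self.ranges[t]
        scale = max(hi - lo, 1e-8) / self.qmax  # guard against a zero-width range
        zero_point = round(-lo / scale)
        return scale, zero_point

    def fake_quant(self, t: int, values: List[float]) -> List[float]:
        """Quantize and dequantize activations using the parameters of timestep t."""
        scale, zero_point = self.params(t)
        out = []
        for v in values:
            q = min(max(round(v / scale) + zero_point, 0), self.qmax)
            out.append((q - zero_point) * scale)
        return out


if __name__ == "__main__":
    quantizer = TimestepMinMaxQuantizer(n_bits=8)
    quantizer.observe(0, [-1.0, 0.2, 1.0])    # wide activations at timestep 0
    quantizer.observe(1, [-0.1, 0.05, 0.1])   # narrow activations at timestep 1
    print(quantizer.params(0), quantizer.params(1))
    print(quantizer.fake_quant(1, [0.05]))
```

Because the set of sampling timesteps is finite, a table like `ranges` stays small, and at inference the current timestep only selects which entry to use. How the repository stores and restores these values may differ from this sketch.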

csguoh commented 4 months ago

Hi, thanks for your reply. With your hint, I found the corresponding code for the timestep-specific scale. However, when it comes to checkpoint saving, I cannot figure out how the timestep-specific activation quantization is used. I noticed that this line stores the time-aware scale, but I cannot find the corresponding loading code in the load_cali_model function. Could you give me some hints about where these timestep-aware scale factors are used during inference?

Thx.