padeirocarlos opened this issue 10 months ago
When optimizing the reconstruction module, the rounding loss is large at the beginning and gradually converges to 0 as the optimization proceeds.
Furthermore, as the module gets deeper, the final value of the rounding loss gradually increases. If the problem persists, you can share your specific loss distribution.
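For context, the rounding loss discussed here is presumably the AdaRound-style rounding regularizer used throughout BRECQ-lineage codebases such as this one; the sketch below assumes that form, and the names `soft_targets`, `beta`, and `reg_weight` are illustrative, not taken from MRECG.py. With `beta` annealed from a high value toward a low one, the term starts large and decays to 0 as the soft rounding variables are pushed to 0 or 1, matching the behavior described above.

```python
import torch

def adaround_reg(soft_targets: torch.Tensor, beta: float, reg_weight: float = 0.01) -> torch.Tensor:
    # soft_targets = h(V) in [0, 1]: the continuous rounding variables of a block.
    # beta is typically annealed from ~20 down to ~2 during optimization, so the
    # regularizer starts large and decays toward 0 as h(V) is driven to {0, 1}.
    return reg_weight * (1 - (2 * soft_targets - 1).abs().pow(beta)).sum()
```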
Sorry, I did not clearly follow this statement:
"Furthermore, as the module gets deeper, the final value of the rounding loss gradually increases. If the problem persists, you can share your specific loss distribution."
Could you elaborate, please? What do you mean here?
Since the model is optimized by block-wise reconstruction, each block of the model has its own optimization process and a convergence loss. In the earlier blocks of the model, the convergence loss of the block optimization tends toward 0. In the deeper blocks, the optimized convergence loss remains at a larger value. At low-bit quantization, the convergence loss in the deeper blocks is even larger. This phenomenon is normal on the ImageNet dataset.
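To make the block-wise picture concrete, here is a minimal sketch of such a reconstruction loop; `blockwise_reconstruction`, `quant_blocks`, and `fp_blocks` are hypothetical names, not this repo's API. Each quantized block is fitted to the output of its FP counterpart, and because deeper blocks receive inputs that already carry the accumulated quantization error of all earlier blocks, their converged loss tends to settle at a larger value.

```python
import torch

def blockwise_reconstruction(quant_blocks, fp_blocks, calib_input, iters=20000):
    """Sketch of per-block reconstruction; records each block's converged loss."""
    x_q, x_fp = calib_input, calib_input
    converged_losses = []
    for qblock, fpblock in zip(quant_blocks, fp_blocks):
        tgt = fpblock(x_fp).detach()                    # FP block output as target
        params = [p for p in qblock.parameters() if p.requires_grad]
        opt = torch.optim.Adam(params, lr=1e-3)
        for _ in range(iters):
            opt.zero_grad()
            # L2 reconstruction loss between quantized and FP block outputs
            loss = (qblock(x_q) - tgt).abs().pow(2).sum(1).mean()
            loss.backward()
            opt.step()
        converged_losses.append(loss.item())            # per-block convergence loss
        x_q, x_fp = qblock(x_q).detach(), tgt           # propagate to the next block
    return converged_losses
```

Plotting `converged_losses` against the block index would give exactly the "loss distribution" asked about above.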
That is true. I also noticed that on my dataset and on ImageNet! I think one of the reasons is the accumulation of quantization error, which becomes larger in deeper blocks or layers. So can providing a specific loss distribution solve this issue? Have you tried that? I will try it!
I tried running your code with pre-trained ResNet50 and MobileNetV2 models. I got the loss function values for the output and pred losses:
```python
rec_loss = lp_loss(pred, tgt, p=self.p)
# :param pred: output from the quantized model
# :param tgt: output from the FP model
# :return: total loss function
```
https://github.com/bytedance/MRECG/blob/main/MRECG.py#L215C10-L225
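For reference, in BRECQ-derived codebases `lp_loss` is usually a plain Lp distance; the sketch below is an assumption about what the call above computes, not a verbatim copy of MRECG.py, and the exact reduction used there may differ.

```python
import torch

def lp_loss(pred: torch.Tensor, tgt: torch.Tensor, p: float = 2.0, reduction: str = 'none') -> torch.Tensor:
    # Lp reconstruction distance between quantized-model and FP-model outputs.
    if reduction == 'none':
        # per-sample distance summed over features, then averaged over the batch
        return (pred - tgt).abs().pow(p).sum(1).mean()
    return (pred - tgt).abs().pow(p).mean()
```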
Are there additional settings I missed?