Closed BlueCat7 closed 2 years ago
Thanks for your attention to our work!
This is normal: floating-point arithmetic produces small errors, which depend on the CUDA and PyTorch versions as well as the batch size.
Knowledge distillation is robust to this arithmetic error. Even though the diff_rate
may not be 0, the saved logits can still be used for distillation without loss of accuracy.
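As a sketch of the point above, cross-machine logits should be compared within a floating-point tolerance rather than for exact equality. The helper name `diff_rate`, the tolerance, and the simulated arrays below are assumptions for illustration, not the repository's actual checking script:

```python
import numpy as np

def diff_rate(a, b, atol=1e-5):
    # Fraction of logit entries that differ by more than atol.
    # Exact equality across machines is not expected: different CUDA/PyTorch
    # versions and batch sizes reorder floating-point reductions.
    return float(np.mean(np.abs(a - b) > atol))

# Simulated logits from two machines: identical values plus tiny rounding noise.
rng = np.random.default_rng(0)
logits_a = rng.standard_normal((8, 1000)).astype(np.float32)
logits_b = logits_a + rng.uniform(-1e-7, 1e-7, logits_a.shape).astype(np.float32)

print(diff_rate(logits_a, logits_b))               # 0.0 within this tolerance
print(np.allclose(logits_a, logits_b, atol=1e-5))  # True
```

A tolerance-based check like this passes even when a bitwise comparison reports a nonzero difference rate.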
Ok, thanks for your reply.
Dear author, thanks for your great work! I generate the logits of CLIP on one machine and then move them to another machine to check them, but diff_rate is not 0. When I generate and check the logits on the same machine, diff_rate is 0. So I am confused about what might be wrong. Looking forward to your reply, thanks!