winycg / CLIP-KD

[CVPR-2024] Official implementations of CLIP-KD: An Empirical Study of CLIP Model Distillation

Question about L2 normalization of the teacher model's output features #5

Closed: ylnxxts closed this issue 1 month ago

ylnxxts commented 2 months ago
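As I recall, CLIP applies L2 normalization to the model outputs when computing the loss, and your paper also states: "Here, all embeddings are post-processed by l2 normalization".

However, when reading your code, I found that only the student model's features are normalized when the loss is computed; the teacher model's features do not appear to be normalized. Did I miss part of the code, or is the code intentionally designed not to normalize the teacher's features?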


winycg commented 2 months ago

Both the student and the teacher features are normalized. Please refer to:
https://github.com/winycg/CLIP-KD/blob/f3ae18b9da8ac8570796c4e113cb8a9006dab18a/src/open_clip/model.py#L213
https://github.com/winycg/CLIP-KD/blob/f3ae18b9da8ac8570796c4e113cb8a9006dab18a/src/training/train.py#L193
https://github.com/winycg/CLIP-KD/blob/f3ae18b9da8ac8570796c4e113cb8a9006dab18a/src/training/train.py#L196
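For readers following along, here is a minimal sketch of why normalizing both sides matters for a feature-level distillation loss. It is not the repository's code: the linked lines are not reproduced here, and the helper names (`l2_normalize`, `feature_mse`, `dot`) and the toy vectors are hypothetical. The example uses plain Python so it runs without PyTorch, and the `eps` clamp only mimics the way `torch.nn.functional.normalize` guards against a zero norm.

```python
import math


def l2_normalize(vec, eps=1e-12):
    """Scale vec to unit L2 norm; eps guards against division by zero."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / max(norm, eps) for x in vec]


def dot(a, b):
    """Dot product; equals cosine similarity when both inputs are unit-norm."""
    return sum(x * y for x, y in zip(a, b))


def feature_mse(a, b):
    """Mean squared error between two feature vectors (illustrative loss)."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


# Hypothetical embeddings: same direction, very different magnitudes.
student_raw = [3.0, 4.0]
teacher_raw = [30.0, 40.0]

student = l2_normalize(student_raw)
teacher = l2_normalize(teacher_raw)

# Without normalization the loss is dominated by the scale mismatch.
print("raw MSE:", feature_mse(student_raw, teacher_raw))
# After normalizing both sides, only the direction is compared.
print("normalized MSE:", feature_mse(student, teacher))
print("cosine similarity:", dot(student, teacher))
```

If only the student were normalized, the loss would still penalize the difference in magnitude, which is the concern raised in the question; normalizing both embeddings removes that effect.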