HobbitLong / RepDistiller

[ICLR 2020] Contrastive Representation Distillation (CRD), and benchmark of recent knowledge distillation methods
BSD 2-Clause "Simplified" License

The sampler is not consistent with the original implementation of CCKD #18

Closed: winycg closed this issue 4 years ago

winycg commented 4 years ago

Hi, why is the sampler (CUR or SUR) not consistent with the original implementation of CCKD (Correlation Congruence for Knowledge Distillation)? I would also like to know what `delta[:-1] * delta[1:]` denotes. Thanks!

HobbitLong commented 4 years ago

@winycg, that's the code the original author shared with me. The snippets I commented out are my own re-implementation according to the paper.
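
On the second part of the question: taken purely as array slicing, `delta[:-1] * delta[1:]` multiplies each row of `delta` elementwise with the row that follows it. Here is a minimal sketch, assuming `delta` is a 2-D array whose first axis indexes the samples in a batch. The values below are made up, and NumPy stands in for whatever tensor library the real code uses. It shows only what the expression computes, not why the loss uses it.

```python
import numpy as np

# Hypothetical stand-in for `delta`: 4 samples (rows) x 3 features (columns).
delta = np.array([
    [1.0, 2.0, 3.0],
    [4.0, 5.0, 6.0],
    [7.0, 8.0, 9.0],
    [1.0, 0.0, 2.0],
])

# delta[:-1] holds rows 0..2 and delta[1:] holds rows 1..3,
# so row i of the product is delta[i] * delta[i + 1], elementwise.
adjacent = delta[:-1] * delta[1:]

# The same result written as an explicit loop over neighbouring rows.
explicit = np.stack([delta[i] * delta[i + 1] for i in range(len(delta) - 1)])

print(adjacent.shape)                      # one row fewer than delta
print(np.array_equal(adjacent, explicit))  # both formulations agree
```

So the expression pairs every sample with its immediate neighbour in the batch, and the result has one row fewer than `delta`.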