Closed by tamltlkdn 5 months ago
Hi @tamltlkdn, we mention the default choice of KD_WEIGHT=9 in the supplementary materials (Section 2, "Implementation Details", on page 2). The phrase "the same experimental settings as previous works [5, 17, 50]" refers to the common settings, including the teacher/student pairs, training epochs, learning rate, optimizer, etc. The references [5, 17, 50] denote ReviewKD, MLKD, and DKD, which share the same settings for training epochs and so on. The phrase does not mean that we specifically follow DKD's KD_WEIGHT choice.
Hi authors, in the DKD+logit_stand part of this implementation, I see that alpha and beta are multiplied by kd_weight (=9). Why is kd_weight 9 rather than 1 (the default)? Didn't you say in the paper that "We follow the same experimental settings as previous works"?
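To make the question concrete, here is a minimal sketch of what multiplying alpha and beta by kd_weight does to a DKD-style loss of the form alpha * TCKD + beta * NCKD. This is not the repository's code: the function name `scaled_dkd_loss` and all numeric values are hypothetical and chosen only for illustration.

```python
def scaled_dkd_loss(tckd, nckd, alpha, beta, kd_weight=1.0):
    """Combine the two DKD terms, scaling both coefficients by kd_weight.

    tckd, nckd: already-computed target-class and non-target-class KD terms.
    alpha, beta: the DKD balancing coefficients.
    kd_weight: extra multiplier applied to both coefficients (assumed form).
    """
    return (kd_weight * alpha) * tckd + (kd_weight * beta) * nckd


# Hypothetical loss terms and coefficients, for illustration only.
tckd, nckd = 0.5, 0.25
alpha, beta = 1.0, 8.0

base = scaled_dkd_loss(tckd, nckd, alpha, beta, kd_weight=1.0)
scaled = scaled_dkd_loss(tckd, nckd, alpha, beta, kd_weight=9.0)

print(base, scaled)
```

Under this assumed form, kd_weight rescales the whole distillation term uniformly, so the ratio between alpha and beta is unchanged and only the balance against the other training losses shifts.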