wutaiqiang:
The performance of LSP in lab_knowledge_distillation is much lower than the results reported in the paper. This decrease is actually caused by GATConv in DGL.
In the DGL source code, GATConv builds its residual connection like this:
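The original snippet was not preserved here; the following is the relevant branch paraphrased from DGL's dgl/nn/pytorch/conv/gatconv.py (attribute names as in DGL around v0.6 and may differ in other versions):

```python
# Inside GATConv.__init__ (paraphrased; names may vary by DGL version):
if residual:
    if self._in_dst_feats != out_feats * num_heads:
        # Input and output widths differ: learn a linear projection.
        self.res_fc = nn.Linear(
            self._in_dst_feats, num_heads * out_feats, bias=False)
    else:
        # Widths match: the residual becomes a plain identity mapping.
        self.res_fc = Identity()
else:
    self.register_buffer('res_fc', None)
```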
The lab file then builds the student layers as:
```python
# Here in_feats = h_dim * n_heads[i-1] equals
# num_heads * out_feats = h_dim * n_heads[i] whenever consecutive
# layers use the same number of heads, so GATConv silently falls
# into the Identity() residual branch.
self.layers.append(GATConv(
    in_feats=h_dim * n_heads[i - 1],
    out_feats=h_dim,
    num_heads=n_heads[i],
    residual=True,
    activation=nn.LeakyReLU(0.02),
))
```
With residual=True, this triggers self.res_fc = Identity(), which is not what is expected here: the residual branch ends up with no learnable projection.
Hence, to fix this bug, we should modify the source code as follows:
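The original patch was not preserved here; a minimal sketch of the fix, assuming the attribute names from the DGL snippet above, is to always give the residual branch a learnable projection:

```python
if residual:
    # Always learn a projection for the residual branch, even when
    # in_feats == num_heads * out_feats, instead of falling back to
    # Identity().
    self.res_fc = nn.Linear(
        self._in_dst_feats, num_heads * out_feats, bias=False)
else:
    self.register_buffer('res_fc', None)
```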
@wutaiqiang thank you for that! It does help increase the score, but unfortunately it does not affect the gain of the LSP student model with respect to a full student model.