Open xiaozhimabing opened 3 years ago
Structured KD is effective because it takes the correlation among pixels into account. If the unary (per-pixel) part is hard to learn or cannot be trained effectively, adding structured KD will help training. For the pair-wise term I tend to choose deeper feature maps, because their more abstract semantics make the pairwise similarities more meaningful. Their spatial size is also smaller, which makes the computation more efficient.
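To make the pair-wise idea concrete, here is a minimal NumPy sketch. It is not the repository's actual code. It assumes the similarity between two pixels is the cosine similarity of their feature vectors and that the loss is the mean squared difference between the student's and teacher's similarity matrices. The function names `pairwise_similarity` and `pairwise_kd_loss` are made up for illustration.

```python
import numpy as np


def pairwise_similarity(feat, eps=1e-8):
    """Cosine similarity between every pair of spatial positions.

    feat: array of shape (C, H, W). Returns an (H*W, H*W) matrix.
    """
    channels = feat.shape[0]
    # Treat each spatial position as a node with a C-dimensional feature.
    nodes = feat.reshape(channels, -1).T  # (H*W, C)
    norms = np.linalg.norm(nodes, axis=1, keepdims=True)
    nodes = nodes / (norms + eps)  # L2-normalise each node
    return nodes @ nodes.T


def pairwise_kd_loss(student_feat, teacher_feat):
    """Mean squared difference between the two similarity matrices.

    Assumption: both feature maps share the same spatial size (H, W);
    the channel counts may differ.
    """
    sim_s = pairwise_similarity(student_feat)
    sim_t = pairwise_similarity(teacher_feat)
    if sim_s.shape != sim_t.shape:
        raise ValueError("student and teacher must have the same H and W")
    return float(np.mean((sim_s - sim_t) ** 2))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    teacher = rng.normal(size=(8, 4, 4))  # hypothetical deep teacher feature
    student = rng.normal(size=(3, 4, 4))  # student with fewer channels
    print(pairwise_kd_loss(student, teacher))
```

Under these assumptions the loss only looks at intermediate features, not at class scores, so in principle it could be added to a regression loss in the same way as to a segmentation loss. The similarity matrix has (H*W)^2 entries, which is one reason a smaller, deeper feature map is cheaper to use.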
thank you!!!
I want to know why structured knowledge distillation is effective and how it can be used for regression tasks. How should I choose the intermediate feature maps for pair-wise knowledge distillation? Can anyone help me with this question?