Open 294486709 opened 11 months ago
In mmrazor, the distillation examples with CWD only work when the teacher and student layers have the same number of feature channels. Is there a way to apply CWD to layers with different numbers of filters?
I have the same problem. The only workaround I see is to manually add a 1x1 conv on the student side so that its feature map has the same number of channels as the teacher's.
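A rough sketch of that workaround in plain PyTorch, in case it helps. This is not mmrazor's own API: `ChannelAdapter` and `cwd_loss` are hypothetical names I made up for illustration, and the loss is my own reading of channel-wise distillation (softmax over spatial positions per channel, then KL divergence). It also assumes the teacher and student feature maps already have the same spatial size.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ChannelAdapter(nn.Module):
    """Hypothetical helper: 1x1 conv that projects student features
    to the teacher's channel count. It is trained with the student."""

    def __init__(self, student_channels: int, teacher_channels: int):
        super().__init__()
        self.proj = nn.Conv2d(student_channels, teacher_channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.proj(x)


def cwd_loss(student_feat: torch.Tensor, teacher_feat: torch.Tensor,
             tau: float = 1.0) -> torch.Tensor:
    """Sketch of a channel-wise distillation loss (an assumption, not
    copied from mmrazor): each channel is turned into a distribution
    over its H*W positions, and KL(teacher || student) is averaged
    over batch and channels."""
    assert student_feat.shape == teacher_feat.shape, "shapes must match"
    n, c, h, w = teacher_feat.shape
    t_flat = teacher_feat.reshape(n * c, h * w) / tau
    s_flat = student_feat.reshape(n * c, h * w) / tau
    t_prob = F.softmax(t_flat, dim=1)
    log_t = F.log_softmax(t_flat, dim=1)
    log_s = F.log_softmax(s_flat, dim=1)
    kl = torch.sum(t_prob * (log_t - log_s))
    return kl * (tau ** 2) / (n * c)


# Example: student layer has 64 channels, teacher layer has 128.
adapter = ChannelAdapter(student_channels=64, teacher_channels=128)
student_feat = torch.randn(2, 64, 8, 8)
teacher_feat = torch.randn(2, 128, 8, 8)
loss = cwd_loss(adapter(student_feat), teacher_feat.detach(), tau=4.0)
loss.backward()  # gradients reach the adapter (and the student in real training)
```

The adapter is only needed during distillation, so it can be dropped once training is done and the student keeps its original width.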