Closed: kaikai123456789 closed this issue 1 year ago
It seems the original DER repo also does not contain the code for masking and pruning. The reproduction in this repo only contains the network expansion process, which has proven to work robustly in most cases.
We noticed that the original DER paper discusses masking and states: “Our pruning method is based on differentiable channel-level masks, which is adapted from HAT”. That is why we were confused. Thank you very much for your answer.
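For anyone else looking for this: since neither repo ships the masking code, here is a rough sketch of what a HAT-style channel mask generally looks like, namely a sigmoid gate over a learnable per-channel embedding, multiplied by a scaling factor that is increased during training so the gate becomes nearly binary. This is only an illustration in plain Python. The names `channel_mask`, `apply_mask`, and `prune_channels` and the 0.5 threshold are my own assumptions, not taken from the DER code or from this repo, and the sparsity loss and scale schedule used in the paper are not shown.

```python
import math

def channel_mask(embeddings, scale):
    """Soft gate per channel: sigmoid(scale * e) for each embedding e."""
    return [1.0 / (1.0 + math.exp(-scale * e)) for e in embeddings]

def apply_mask(feature_map, mask):
    """Scale every value in channel c by mask[c].

    feature_map is a list of channels, each channel a list of floats.
    """
    return [[m * v for v in channel] for channel, m in zip(feature_map, mask)]

def prune_channels(feature_map, mask, threshold=0.5):
    """Keep only channels whose gate is above the (assumed) threshold."""
    return [channel for channel, m in zip(feature_map, mask) if m > threshold]

# Hypothetical embeddings for three channels.
embeddings = [2.0, -2.0, 0.0]

# Small scale: gates are soft, so gradients can still reach the embeddings.
soft = channel_mask(embeddings, scale=1.0)

# Large scale: gates saturate towards 0 or 1, which approximates a binary mask.
hard = channel_mask(embeddings, scale=50.0)

features = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]
masked = apply_mask(features, soft)
kept = prune_channels(features, hard)
```

With a large scale, only the first channel (positive embedding) survives pruning in this toy example; in a real model the embeddings would be trained parameters of each convolutional layer.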
Hello, could you tell me where the mask is located in the DER code? Thanks.