tigereatsheep closed this issue 4 years ago
Thank you for your interest in our work! Yes, it can be used for per-kernel pruning if you compute the mask matrix by kernel-wise aggregation and apply it by broadcasting. Note that connection pruning (i.e., sparsification) reduces the number of non-zero parameters, not the number of FLOPs, unless the model runs on specialized software and hardware platforms.
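As a rough illustration of the kernel-wise aggregation and broadcasting mentioned above, here is a minimal NumPy sketch. It is not the repository's code: the function name `kernel_wise_mask`, the `(out_channels, in_channels, kh, kw)` weight layout, the use of summed absolute scores as the aggregate, and the `keep_ratio` parameter are all assumptions made for the example.

```python
import numpy as np

def kernel_wise_mask(scores, keep_ratio):
    """Build a 0/1 mask that keeps or drops whole 2D kernels.

    scores: per-weight saliency, assumed shape (out_c, in_c, kh, kw).
    keep_ratio: fraction of kernels to keep (hypothetical parameter).
    """
    out_c, in_c, _, _ = scores.shape
    # Aggregate per-weight scores into one score per 2D kernel.
    kernel_scores = np.abs(scores).sum(axis=(2, 3))  # shape (out_c, in_c)
    flat = kernel_scores.ravel()
    n_keep = max(1, int(round(keep_ratio * flat.size)))
    # Indices of the n_keep highest-scoring kernels.
    keep_idx = np.argsort(flat)[-n_keep:]
    kernel_mask = np.zeros(flat.shape, dtype=scores.dtype)
    kernel_mask[keep_idx] = 1
    kernel_mask = kernel_mask.reshape(out_c, in_c)
    # Broadcast the per-kernel decision back to the full weight shape.
    return np.broadcast_to(kernel_mask[:, :, None, None], scores.shape).copy()

# Example with random scores standing in for real saliency values.
rng = np.random.default_rng(0)
scores = rng.standard_normal((4, 3, 3, 3))
mask = kernel_wise_mask(scores, keep_ratio=0.5)
```

The resulting mask can be multiplied element-wise with the convolution weights. Every kernel is either fully kept or fully zeroed, so the parameter count drops, but a dense convolution still performs the same number of FLOPs.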
Thanks.
First, thank you for your awesome work. I have two questions: