princeton-nlp / CoFiPruning

[ACL 2022] Structured Pruning Learns Compact and Accurate Models https://arxiv.org/abs/2204.00408
MIT License

Introduce random teacher layer sets #35

Closed: zhangzhenyu13 closed this issue 2 years ago

zhangzhenyu13 commented 2 years ago

I find that a fixed teacher layer set might not be a good choice for CoFi, so introducing random selection of the teacher layer set could make the method more robust. For reference, see: [2109.10164] RAIL-KD: RAndom Intermediate Layer Mapping for Knowledge Distillation (arxiv.org)
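To make the suggestion concrete, here is a rough sketch of what random teacher layer selection might look like. It only illustrates the idea: the function name `sample_teacher_layers` and its parameters are hypothetical and not part of the CoFiPruning codebase, and the 12-layer teacher with 4 distilled layers is just an assumed example. The layers are sorted after sampling so that lower teacher layers still map to lower student layers, as I understand RAIL-KD to do.

```python
import random


def sample_teacher_layers(num_teacher_layers, num_matched, rng=None):
    """Pick a random, order-preserving subset of teacher layer indices.

    Hypothetical helper, not CoFiPruning's API. A fresh subset would be
    drawn every epoch (or every batch) instead of using one fixed set.
    """
    rng = rng or random
    # Sample without replacement so no teacher layer is chosen twice.
    layers = rng.sample(range(num_teacher_layers), num_matched)
    # Sort so the i-th selected teacher layer stays below the (i+1)-th.
    return sorted(layers)


if __name__ == "__main__":
    rng = random.Random(0)  # seeded only to make the demo repeatable
    for epoch in range(3):
        # Assumed example: 12-layer teacher, 4 layers used for distillation.
        print(epoch, sample_teacher_layers(12, 4, rng))
```

The sampled indices would then replace the fixed teacher layer list when computing the layer-wise distillation loss; how often to resample (per epoch or per batch) is a design choice I have not tested here.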