princeton-nlp / CoFiPruning

[ACL 2022] Structured Pruning Learns Compact and Accurate Models https://arxiv.org/abs/2204.00408
MIT License

Bug or intent? #53

Closed mpiorczynski closed 9 months ago

mpiorczynski commented 10 months ago

Hi! I have a question: why do you check in this line that the MHA layer is not pruned, while here you check for the FFN layer? Since distillation happens at the outputs of the FFN layers, shouldn't the check be for the presence of the FFN rather than the MHA? Is this intentional, or is it potentially a bug?

xiamengzhou commented 10 months ago

It seems to be a bug to me. The correct way to define existing layers should be `existing_layers = (head_layer_z != 0) | (mlp_z != 0)`. We mostly used version 3 during our development and didn't test version 4 much. Thanks for pointing this out!
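
For anyone running into this, here is a minimal sketch of the corrected check. It assumes `head_layer_z` and `mlp_z` are per-layer mask tensors of shape `[num_layers]`; the helper names and the one-to-one layer matching in the toy distillation loss are illustrative only, not the exact scheme used in the repo:

```python
import torch
import torch.nn.functional as F


def get_existing_layers(head_layer_z: torch.Tensor, mlp_z: torch.Tensor) -> torch.Tensor:
    # A layer still produces a useful output for layer distillation if
    # either its MHA sub-block or its FFN sub-block survives pruning,
    # so the mask should combine both checks rather than only the MHA one.
    return (head_layer_z != 0) | (mlp_z != 0)


def layer_distill_loss(teacher_hiddens, student_hiddens, existing_layers):
    # Simplified matching: compare each surviving student layer with the
    # teacher layer at the same index via MSE over hidden states.
    loss = torch.tensor(0.0)
    for i, keep in enumerate(existing_layers.tolist()):
        if keep:
            loss = loss + F.mse_loss(student_hiddens[i], teacher_hiddens[i])
    return loss


# Toy example: 4 layers, hidden size 8, 2 token positions.
head_layer_z = torch.tensor([1.0, 0.0, 1.0, 0.0])  # MHA pruned in layers 2 and 4
mlp_z = torch.tensor([1.0, 1.0, 0.0, 0.0])         # FFN pruned in layers 3 and 4
existing = get_existing_layers(head_layer_z, mlp_z)
print(existing)  # tensor([ True,  True,  True, False]) -> only layer 4 is fully gone

teacher_hiddens = [torch.randn(2, 8) for _ in range(4)]
student_hiddens = [torch.randn(2, 8) for _ in range(4)]
print(layer_distill_loss(teacher_hiddens, student_hiddens, existing))
```

With the original MHA-only check, layer 3 in this example (FFN pruned but MHA kept) would be incorrectly excluded from layer distillation.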

mpiorczynski commented 9 months ago

Thanks a lot!