princeton-nlp / CoFiPruning

[ACL 2022] Structured Pruning Learns Compact and Accurate Models https://arxiv.org/abs/2204.00408
MIT License

The initialization of 'intermediate' loga #51

Open secretu opened 10 months ago

secretu commented 10 months ago

Hi, I noticed that the initialization of 'intermediate_z' differs from the others, which introduces an initial sparsity in the MLP layer. Why is this initialization step different?

https://github.com/princeton-nlp/CoFiPruning/blob/da855a809c4a15e1c964a47a37998db2e1a226fd/models/l0_module.py#L147C8-L147C39

https://github.com/princeton-nlp/CoFiPruning/blob/da855a809c4a15e1c964a47a37998db2e1a226fd/models/l0_module.py#L134C9-L134C9

xiamengzhou commented 10 months ago

Hi, I believe that it's mostly a typo. I also vaguely remember that having an initial sparsity does not affect performance much!
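
For readers wondering why the starting value of a `loga` parameter would translate into an initial sparsity at all, here is a minimal sketch. It assumes the standard hard concrete gate parameterization from the L0 regularization literature, with the commonly used temperature 2/3 and stretch interval (-0.1, 1.1). These constants, the function names, and the two example means (10.0 and 0.0) are illustrative assumptions and are not read from `l0_module.py`.

```python
import math

# Commonly used hard concrete hyperparameters; assumed here for illustration,
# not taken from the CoFiPruning source.
BETA = 2.0 / 3.0   # temperature
LEFT = -0.1        # lower end of the stretch interval
RIGHT = 1.1        # upper end of the stretch interval


def prob_gate_nonzero(log_alpha: float) -> float:
    """Probability that a hard concrete gate with location log_alpha is nonzero."""
    shift = BETA * math.log(-LEFT / RIGHT)
    return 1.0 / (1.0 + math.exp(-(log_alpha - shift)))


def expected_initial_sparsity(log_alpha_mean: float) -> float:
    """Expected fraction of gates that start closed when loga is centered at the given mean."""
    return 1.0 - prob_gate_nonzero(log_alpha_mean)


if __name__ == "__main__":
    # Hypothetical initial means: a large one and a small one.
    for mean in (10.0, 0.0):
        print(f"mean={mean:5.1f}  expected sparsity={expected_initial_sparsity(mean):.4f}")
```

Under these assumptions, a large initial mean keeps essentially every gate open, while a mean near zero already closes a noticeable share of them (roughly 17% here). That is the kind of gap a differently initialized mask would produce at the start of training.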