princeton-nlp / CoFiPruning

[ACL 2022] Structured Pruning Learns Compact and Accurate Models https://arxiv.org/abs/2204.00408

An issue when reproducing the efficiency evaluation #39

Closed: ROIM1998 closed this issue 1 year ago

ROIM1998 commented 1 year ago

Hi @xiamengzhou. When reproducing the efficiency evaluation of the [CoFi-MNLI-s95] model on a single NVIDIA A100 graphics card, I measured the model's speed as 8.8e-05 seconds/example, whereas the vanilla fine-tuned BERT's speed is 4.6e-04 seconds/example, so the speedup is only about 5.23× instead of the reported 12.1×. Could the decrease in speedup come from the difference in hardware? Are there any other possible reasons for the difference in the efficiency numbers? Many thanks!
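For context, here is a minimal sketch of how I compute seconds/example and the speedup, assuming a CUDA device and an already-loaded model plus a tokenized single-example batch. This is my own timing loop, not the repository's evaluation script, so the warm-up count and run count are arbitrary choices:

```python
import time
import torch

@torch.no_grad()
def seconds_per_example(model, batch, n_warmup=10, n_runs=100):
    """Average GPU forward-pass latency for a single example over n_runs."""
    model.eval()
    for _ in range(n_warmup):      # warm-up so CUDA initialization is excluded
        model(**batch)
    torch.cuda.synchronize()       # drain queued kernels before starting the clock
    start = time.time()
    for _ in range(n_runs):
        model(**batch)
    torch.cuda.synchronize()       # drain again before stopping the clock
    return (time.time() - start) / n_runs

# Speedup = baseline latency / pruned latency,
# e.g. 4.6e-04 / 8.8e-05 ≈ 5.23x with the A100 numbers above.
```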

The output for CoFi-MNLI-s95 testing:

[screenshot of the CoFi-MNLI-s95 evaluation output]

The output for fine-tuned BERT testing:

[screenshot of the fine-tuned BERT evaluation output]
xiamengzhou commented 1 year ago

Hi, sorry for the late reply! Yes, I think the numbers differ across hardware. We tested on a V100 rather than an A100 at the time, and it could be that the A100 is more optimized for similarly shaped structures, which would shrink the relative speedup from pruning.