microsoft / TransformerCompression

For releasing code related to compression methods for transformers, accompanying our publications
MIT License

What is the number of parameters after slicing? #165

Closed yaya-sy closed 2 months ago

yaya-sy commented 3 months ago

Hi, thank you for sharing your work.

I was wondering: after slicing a model at, for example, 20%, what is the number of parameters of the resulting model? Is it number_of_parameters_of_base_model minus 20%?

Thank you a lot.

MrGGLS commented 2 months ago

Hi @yaya-sy. The fraction of parameters removed is approximately the ratio set for slicing, but the sliced model ends up with more parameters than that ratio alone would suggest, because additional Q matrices are introduced. You can check the code below, from run_slicegpt_perplexity.py:

# original_param_count is presumably obtained with the same sum over model.parameters() before slicing
sliced_param_count = sum(int(p.nelement()) for p in model.parameters())
sliced_fraction = 1.0 - sliced_param_count / original_param_count
logging.info(f'Sliced model parameters: {sliced_param_count:,d} (sliced fraction {sliced_fraction:.4f})')

The sliced_fraction here represents the actual pruning ratio.
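
To make the difference between the nominal ratio and the measured one concrete, here is a minimal sketch in plain Python. The helper names (count_params, sliced_fraction) and all shapes are hypothetical toy values, not the real layout of any model in the repository; the extra (k, k) entry is only a stand-in for the additional Q matrices mentioned above.

```python
from math import prod


def count_params(shapes):
    """Total number of elements across a collection of weight shapes."""
    return sum(prod(shape) for shape in shapes)


def sliced_fraction(original_param_count, sliced_param_count):
    """Fraction of parameters actually removed, as in the snippet above."""
    return 1.0 - sliced_param_count / original_param_count


# Hypothetical toy block: hidden size d, inner size 4 * d.
d = 100
original_shapes = [(d, 4 * d), (4 * d, d)]

# Assume a nominal 20% slice keeps k = 80 of the d hidden dimensions.
k = 80
sliced_shapes = [
    (k, 4 * d),  # sliced input projection
    (4 * d, k),  # sliced output projection
    (k, k),      # stand-in for an additional Q matrix (assumption)
]

original_param_count = count_params(original_shapes)
sliced_param_count = count_params(sliced_shapes)
fraction = sliced_fraction(original_param_count, sliced_param_count)

print(f"Original parameters: {original_param_count:,d}")
print(f"Sliced parameters: {sliced_param_count:,d} (sliced fraction {fraction:.4f})")
```

With these toy shapes the nominal slice is 20%, but the measured fraction comes out lower, because the extra matrix adds parameters back. For a real model, the value logged by the script is the one to trust.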

yaya-sy commented 2 months ago

Hi, yes, I see. Thank you very much, and thank you to the authors for releasing the code. Very useful! I'll close this issue.