locuslab/wanda

A simple and effective LLM pruning approach.
https://arxiv.org/abs/2306.11695
MIT License

Can this method be used on BLOOM models? #7

Open 18140663659 opened 1 year ago

Eric-mingjie commented 1 year ago

It is general for Transformer-based large language models. We evaluate mostly on LLaMA in our paper because of its superior performance; we have additional results on Pythia and OPT in the appendix.

In terms of the BLOOM models, do you have a particular model in mind? If so, could you share the Hugging Face model id? I can help look into pruning it with our approach, Wanda.
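For context, the pruning criterion from the paper (arXiv:2306.11695) scores each weight by its magnitude times the norm of the corresponding input feature, S_ij = |W_ij| · ||X_j||_2, and removes the lowest-scoring weights within each output row. Below is a minimal sketch of that metric for a single linear layer; the function name and interface are illustrative, not this repo's API:

```python
import torch

def wanda_prune_linear(weight: torch.Tensor, act_norm: torch.Tensor,
                       sparsity: float = 0.5) -> torch.Tensor:
    """Zero the lowest-scoring fraction of a linear layer's weights.

    weight   : (out_features, in_features) matrix W, modified in place,
               so call this under torch.no_grad() for real model weights
    act_norm : (in_features,) vector of input-feature norms ||X_j||_2,
               gathered from a forward pass over calibration data
    """
    # Wanda score for each weight: S_ij = |W_ij| * ||X_j||_2
    score = weight.abs() * act_norm
    # Number of weights to remove in each output row
    k = int(weight.shape[1] * sparsity)
    # Indices of the k lowest scores within every row
    # (per-output-row comparison group, as in the paper)
    _, prune_idx = torch.topk(score, k, dim=1, largest=False)
    mask = torch.zeros_like(weight, dtype=torch.bool)
    mask.scatter_(1, prune_idx, True)
    weight[mask] = 0.0
    return weight
```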

18140663659 commented 1 year ago

> It is general for Transformer-based large language models. We evaluate mostly on LLaMA in our paper because of its superior performance; we have additional results on Pythia and OPT in the appendix.
>
> In terms of the BLOOM models, do you have a particular model in mind? If so, could you share the Hugging Face model id? I can help look into pruning it with our approach, Wanda.

Thank you for your reply. I would like to know the pruning results for the following models of different sizes on Hugging Face: bigscience/bloom-7b1, bigscience/bloom-3b, and bigscience/bloom-1b7.

Eric-mingjie commented 1 year ago

Hi, we have some results on BLOOM models, summarized here (perplexity at unstructured 50% sparsity; lower is better):

| BLOOM     | 560M  | 1.1B  | 1.7B  | 3B    | 7.1B  |
|-----------|-------|-------|-------|-------|-------|
| dense     | 22.42 | 17.68 | 15.39 | 13.48 | 11.37 |
| magnitude | 2e10  | 1e6   | 2e5   | 8e6   | 2e6   |
| SparseGPT | 28.92 | 21.35 | 18.88 | 16.76 | 13.96 |
| Wanda     | 30.74 | 22.72 | 19.79 | 16.45 | 13.55 |
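To connect the table to the metric sketched earlier: the `act_norm` statistics come from a small calibration pass (the paper uses 128 sequences from C4). The snippet below is not this repo's calibration pipeline, just a hedged, minimal way to gather the ||X_j||_2 statistics for one of the BLOOM checkpoints above using plain `transformers` forward hooks:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "bigscience/bloom-560m"  # smallest BLOOM size from the table above
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
model.eval()

# Accumulate squared L2 norms of the inputs to every nn.Linear layer.
norms = {}

def make_hook(name):
    def hook(module, inputs, output):
        # Flatten (batch, seq, hidden) -> (tokens, in_features)
        x = inputs[0].detach().reshape(-1, inputs[0].shape[-1])
        sq = x.pow(2).sum(dim=0)
        norms[name] = norms.get(name, torch.zeros_like(sq)) + sq
    return hook

handles = [
    m.register_forward_hook(make_hook(n))
    for n, m in model.named_modules()
    if isinstance(m, torch.nn.Linear)
]

# A toy one-sentence calibration pass; real runs use many more sequences.
batch = tok("Pruning removes redundant weights from a network.",
            return_tensors="pt")
with torch.no_grad():
    model(**batch)
for h in handles:
    h.remove()

# norms[name].sqrt() is the ||X_j||_2 vector to feed into the
# wanda_prune_linear sketch above with sparsity=0.5, matching the
# unstructured 50% setting in the table.
```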