princeton-nlp / LLM-Shearing

[ICLR 2024] Sheared LLaMA: Accelerating Language Model Pre-training via Structured Pruning
https://arxiv.org/abs/2310.06694
MIT License

Pruning fine-tuned model #55

Status: Closed (kiucho closed this 6 months ago)

kiucho commented 6 months ago

Thanks for sharing your great research.

I was wondering whether pruning can be applied to an already fine-tuned model, and I'd like to ask for your advice. If so, would continued pre-training still be possible afterward?

xiamengzhou commented 6 months ago

I think it should work for pruning fine-tuned models! The key is to add back the fine-tuning data at some point, either in the first stage (pruning) or in the second stage (continued pre-training).
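
For illustration, here is a minimal Python sketch of that idea, assuming you can mix the fine-tuning documents into the training stream at some fixed proportion. The function name, the toy datasets, and the `finetune_ratio` value are all hypothetical and are not part of the LLM-Shearing codebase, which configures its data mixture differently.

```python
import random

def mixed_stream(pretrain_docs, finetune_docs, finetune_ratio=0.1, seed=0):
    """Yield documents, drawing from the fine-tuning set with probability
    `finetune_ratio` and from the pre-training set otherwise. Stops when
    the chosen source runs out of documents."""
    rng = random.Random(seed)
    pretrain_it = iter(pretrain_docs)
    finetune_it = iter(finetune_docs)
    while True:
        source = finetune_it if rng.random() < finetune_ratio else pretrain_it
        try:
            yield next(source)
        except StopIteration:
            return

# Toy usage: roughly 1 in 5 documents comes from the fine-tuning set.
pretrain = [f"pretrain doc {i}" for i in range(20)]
finetune = [f"finetune doc {i}" for i in range(5)]
for doc in mixed_stream(pretrain, finetune, finetune_ratio=0.2):
    print(doc)
```

Whether the mixing happens during pruning or during continued pre-training, the same proportion-based sampling idea applies; only the stage at which the fine-tuning data is injected changes.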

kiucho commented 6 months ago

Thanks for the reply! I'll try to make some progress.