locuslab / wanda

A simple and effective LLM pruning approach.
https://arxiv.org/abs/2306.11695
MIT License

Why is my pruned llama2-7B the same size as the original model? #65

Open ChengShuting opened 3 months ago

ChengShuting commented 3 months ago

Why, when I prune llama2-7B, is the resulting model the same size as the original model?

yaya-sy commented 1 month ago

I think pruning does not give any memory gain on its own: the pruned weights are set to zero but are still stored in the dense weight tensors, so the checkpoint stays the same size. However, you may observe some inference speedup if the model is pruned with a semi-structured pruning method on hardware that supports it.
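
A minimal sketch of why this happens, using a NumPy array as a stand-in for one weight matrix (the layer shape and 50% sparsity here are illustrative, not Wanda's actual configuration): zeroing weights in place leaves the dense tensor's storage footprint untouched.

```python
import numpy as np

# Hypothetical dense weight matrix standing in for one LLM layer.
rng = np.random.default_rng(0)
weights = rng.standard_normal((1024, 1024)).astype(np.float32)
size_before = weights.nbytes

# Unstructured pruning: zero out the ~50% of weights with the
# smallest magnitude (magnitude pruning, for illustration).
pruned = weights.copy()
threshold = np.median(np.abs(pruned))
pruned[np.abs(pruned) < threshold] = 0.0

# The zeros are still stored explicitly as float32 values, so the
# dense tensor occupies exactly the same memory as before pruning.
print(size_before == pruned.nbytes)  # True
```

To actually save memory, the zeroed weights would need to be stored in a sparse or compressed format rather than as explicit zeros in a dense tensor.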