locuslab / wanda

A simple and effective LLM pruning approach.
https://arxiv.org/abs/2306.11695
MIT License

Why is my pruned llama2-7B the same size as the original model? #65

Open ChengShuting opened 3 months ago

ChengShuting commented 3 months ago

Why, when I prune llama2-7B, is the resulting model the same size as the original model?

yaya-sy commented 1 month ago

I think pruning does not give any memory gain on its own: the pruned weights are set to zero but are still stored in the dense weight tensors, so the checkpoint stays the same size. However, you may observe some inference speedup if the model is pruned with a semi-structured pruning method on hardware that supports it.
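
A minimal sketch of why this happens, using a NumPy array as a stand-in for one weight matrix (the layer shape and 50% sparsity here are illustrative, not Wanda's actual configuration): zeroing weights in place leaves the dense tensor's storage footprint untouched.

```python
import numpy as np

# Hypothetical dense weight matrix standing in for one LLM layer.
rng = np.random.default_rng(0)
weights = rng.standard_normal((1024, 1024)).astype(np.float32)
size_before = weights.nbytes

# Unstructured pruning: zero out the ~50% of weights with the
# smallest magnitude (magnitude pruning, for illustration).
pruned = weights.copy()
threshold = np.median(np.abs(pruned))
pruned[np.abs(pruned) < threshold] = 0.0

# The zeros are still stored explicitly as float32 values, so the
# dense tensor occupies exactly the same memory as before pruning.
print(size_before == pruned.nbytes)  # True
```

To actually save memory, the zeroed weights would need to be stored in a sparse or compressed format rather than as explicit zeros in a dense tensor.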