Open ChengShuting opened 3 months ago
Why is the model the same size as the original after I prune llama2-7B?
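I think pruning by itself does not give any memory gain: the pruned weights are set to zero but are still stored in the same dense tensors, so the saved model stays the same size. However, you may observe some speedup if your model is pruned with a semi-structured pruning method.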
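To illustrate why the size does not change, here is a minimal sketch in plain Python. It is not the code from this repository: `magnitude_prune` is a hypothetical helper, and a flat `array` of 32-bit floats stands in for a weight tensor. It zeroes the half of the weights with the smallest magnitude and compares the byte size before and after.

```python
from array import array
import random


def magnitude_prune(weights, sparsity):
    """Return a copy with the smallest-magnitude fraction of weights set to zero.

    Hypothetical helper for illustration only. The dense layout is kept,
    so every zero still occupies one full float slot.
    """
    k = int(len(weights) * sparsity)
    # Indices of the k entries with the smallest absolute value.
    drop = sorted(range(len(weights)), key=lambda i: abs(weights[i]))[:k]
    pruned = array(weights.typecode, weights)
    for i in drop:
        pruned[i] = 0.0
    return pruned


random.seed(0)
# Stand-in for a dense weight tensor stored as 32-bit floats.
dense = array("f", (random.uniform(-1, 1) for _ in range(1024)))
pruned = magnitude_prune(dense, 0.5)

dense_bytes = len(dense) * dense.itemsize
pruned_bytes = len(pruned) * pruned.itemsize
zero_fraction = sum(1 for w in pruned if w == 0.0) / len(pruned)

print("dense bytes: ", dense_bytes)
print("pruned bytes:", pruned_bytes)
print("zero fraction:", zero_fraction)
```

Both byte counts are equal even though half of the entries are zero. Saving memory would require storing the weights in a sparse or compressed format, which is a separate step from pruning.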