microsoft / Moonlit

This is a collection of our research on efficient AI, covering hardware-aware NAS and model compression.
MIT License

Support for Compresso pruned weights removal #45

Open Tyler-Durden-official opened 10 months ago

Tyler-Durden-official commented 10 months ago

Currently, after merging the pruning masks and the LoRA weights, the LLaMA-7B checkpoint grows from 15 GB to 26 GB on disk. Please add support for removing the pruned weights from the merged model.
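As a stopgap (not an official Compresso feature), the reported growth is consistent with the merged checkpoint being serialized in fp32, and the masked weights still occupying their full dense shape. The sketch below assumes a standard Hugging Face LLaMA layout; the paths are hypothetical. It re-saves the merged model in half precision and reports how many MLP rows the masks actually zeroed out, to estimate what physical removal could save.

```python
# Minimal, illustrative sketch (not Compresso's own tooling); paths are hypothetical.
import torch
from transformers import AutoModelForCausalLM

MERGED_PATH = "path/to/merged-llama-7b"  # hypothetical location of the merged model

# 1) Re-save in half precision: if the merge step wrote fp32 tensors, loading in
#    fp16 and saving again roughly halves the on-disk size.
model = AutoModelForCausalLM.from_pretrained(MERGED_PATH, torch_dtype=torch.float16)
model.save_pretrained("llama-7b-merged-fp16", safe_serialization=True)

# 2) Report how much of each MLP intermediate dimension is fully zeroed out.
#    Physically removing those channels would also require shrinking
#    intermediate_size in the config (per layer), which the stock LlamaConfig
#    does not express, so this only measures the potential saving.
for name, module in model.named_modules():
    if name.endswith("mlp.up_proj"):
        zero_rows = (module.weight.abs().sum(dim=1) == 0).sum().item()
        print(f"{name}: {zero_rows}/{module.weight.shape[0]} pruned rows")
```

Actually dropping the pruned channels from the saved tensors would need repo-side support, since the loader has to know the reduced per-layer shapes.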