whittle-org / whittle

Python library to compress LitGPT models for resource efficient inference.
https://whittle-org.github.io/whittle/latest/
Apache License 2.0

Support initializing a smaller dense model from weights of a larger model #177

Open rheasukthanker opened 14 hours ago

Is your feature request related to a problem? Please describe.
We can currently extract a sub-network from a super-network. However, given two separately instantiated models, a larger one and a smaller one, we cannot initialize the smaller model from the weights of the larger one, i.e. copy them over. See this test utility function for a simple example: https://github.com/whittle-org/whittle/blob/87de1d0e6d1d4a31c1a267eb1cb10951ce9eb4f7/test/test_api.py#L21. It would be nice to support this functionality.

Describe the solution you'd like
A simple function (perhaps using parts of extract_sub_network) that makes this possible for any two input models. Please add a test for pythia and llama-3.2 extraction from llama-3.1.
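
To make the request concrete, here is a minimal sketch of what such a helper could look like. It is written in plain PyTorch and does not use any whittle or LitGPT API. The name `init_from_larger` is hypothetical, and the rule "take the leading slice along every dimension" is an assumption made for illustration. Fused attention projections in real LitGPT models would most likely need head-aware indexing instead.

```python
import torch
from torch import nn


def init_from_larger(small: nn.Module, large: nn.Module) -> nn.Module:
    """Hypothetical helper: fill `small` with the leading slice of each
    same-named tensor in `large`. Assumes both models share parameter names."""
    large_state = large.state_dict()
    new_state = {}
    for name, small_tensor in small.state_dict().items():
        if name not in large_state:
            raise KeyError(f"'{name}' is missing from the larger model")
        large_tensor = large_state[name]
        fits = large_tensor.dim() == small_tensor.dim() and all(
            s <= l for s, l in zip(small_tensor.shape, large_tensor.shape)
        )
        if not fits:
            raise ValueError(f"'{name}' does not fit inside the larger tensor")
        # Assumption: keep the first `s` entries along every dimension.
        index = tuple(slice(0, s) for s in small_tensor.shape)
        new_state[name] = large_tensor[index].clone()
    small.load_state_dict(new_state)
    return small


def make_mlp(d_in: int, d_hidden: int, d_out: int) -> nn.Sequential:
    """Toy stand-in for a dense model; not a LitGPT architecture."""
    return nn.Sequential(
        nn.Linear(d_in, d_hidden), nn.ReLU(), nn.Linear(d_hidden, d_out)
    )


large = make_mlp(8, 16, 8)
small = make_mlp(4, 6, 4)
init_from_larger(small, large)
```

After the call, each weight and bias in `small` is a copy of the top-left block of the corresponding tensor in `large`, so later training of one model does not affect the other.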

Additional context
This is very useful if one simply wants to extract a network and use it for knowledge distillation.