OscarXZQ / weight-selection


Layer Selection Implementation #4

Closed neko941 closed 2 months ago

neko941 commented 4 months ago

Thank you for your excellent work. I've been reviewing your GitHub repo, and I'm interested in the implementation details of the Last-N, Mid-N, and Uniform layer selection methods mentioned in the paper. I couldn't locate them in the repository. Could you point me to where they are, or, if they're not included, would it be possible to add them?

OscarXZQ commented 4 months ago

@neko941 Hi, thanks for recognizing our work!

The implementation of Last-N, Mid-N or Uniform layer selection depends on how model weights are arranged.

Our own implementation of the other layer selection methods creates a map between layer indices (for example, mapping 0, 1, 2, 3, 4, 5 to 6, 7, 8, 9, 10, 11 for Last-N selection when taking the last 6 layers of a 12-layer model) and then finds the corresponding weight matrix in the teacher model by replacing the index in the weight's name. That part of the code can be very model-specific (and ugly), so we don't plan to include it in the repo.
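
For anyone looking for a starting point, the index-mapping idea described above could be sketched roughly as follows. This is not the authors' code. The function names `layer_index_map` and `teacher_key`, the `blocks.<idx>.` key pattern, and the exact definitions used here for Mid-N (a centered contiguous block) and Uniform (evenly strided indices) are all assumptions made for illustration and may differ from what the paper used.

```python
import re


def layer_index_map(num_student, num_teacher, method):
    """Map each student layer index to a teacher layer index (hypothetical helper)."""
    if num_student > num_teacher:
        raise ValueError("student cannot have more layers than teacher")
    if method == "first":
        offset = 0
    elif method == "last":
        # e.g. 6 of 12 layers: 0..5 -> 6..11
        offset = num_teacher - num_student
    elif method == "mid":
        # Assumption: a contiguous block centered in the teacher.
        offset = (num_teacher - num_student) // 2
    elif method == "uniform":
        # Assumption: evenly strided indices, e.g. 0, 2, 4, ... for 6 of 12.
        return {i: i * num_teacher // num_student for i in range(num_student)}
    else:
        raise ValueError(f"unknown method: {method}")
    return {i: i + offset for i in range(num_student)}


def teacher_key(student_key, index_map, prefix="blocks"):
    """Rewrite the layer index in a weight name such as 'blocks.3.attn.qkv.weight'.

    The '<prefix>.<idx>.' naming is an assumption; adapt it to the model at hand.
    Keys without a layer index (e.g. 'cls_token') are returned unchanged.
    """
    pattern = rf"\b{re.escape(prefix)}\.(\d+)\."

    def swap(match):
        return f"{prefix}.{index_map[int(match.group(1))]}."

    return re.sub(pattern, swap, student_key, count=1)


if __name__ == "__main__":
    last_map = layer_index_map(6, 12, "last")
    print(last_map)
    print(teacher_key("blocks.0.attn.qkv.weight", last_map))
```

With a map like this, each student parameter name can be translated to a teacher parameter name before looking up the teacher tensor in its state dict; the element-wise weight selection itself is unchanged.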