Closed by neko941 2 months ago
@neko941 Hi, thanks for recognizing our work!
The implementation of Last-N, Mid-N or Uniform layer selection depends on how model weights are arranged.
Our own implementation of the other layer selection methods builds a map between student and teacher layer indices (for example, mapping 0,1,2,3,4,5 to 6,7,8,9,10,11 for Last-N selection when taking the last 6 layers of a 12-layer model), then finds the corresponding weight matrix in the teacher model by replacing the index in the weight's name. That part of the code can be very model-specific (and ugly), so we don't plan to include it in the repo.
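For anyone who wants to try this themselves, here is a rough sketch of the index-mapping idea described above. It is not the code used for the paper. The function names `select_layers` and `map_weight_name`, the `layer.<index>.` naming pattern, and the exact definitions of Mid-N (a centered contiguous block) and Uniform (evenly spaced indices with integer flooring) are all assumptions made for illustration, so adjust them to your model's actual weight names.

```python
import re


def select_layers(num_teacher, num_student, method):
    """Return one teacher layer index per student layer (hypothetical helper)."""
    if not 0 < num_student <= num_teacher:
        raise ValueError("num_student must be in 1..num_teacher")
    if method == "uniform":
        # Assumed definition: evenly spaced from the first to the last layer.
        if num_student == 1:
            return [0]
        return [(i * (num_teacher - 1)) // (num_student - 1)
                for i in range(num_student)]
    if method == "first":
        start = 0
    elif method == "last":
        start = num_teacher - num_student
    elif method == "mid":
        # Assumed definition: a contiguous block centered in the teacher.
        start = (num_teacher - num_student) // 2
    else:
        raise ValueError(f"unknown method: {method}")
    return list(range(start, start + num_student))


def map_weight_name(student_name, layer_map, pattern=r"(layers?\.)(\d+)(\.)"):
    """Swap the layer index in a student weight name for the teacher's index.

    The default pattern assumes names like 'encoder.layer.0.attention...'.
    Names without a layer index (e.g. embeddings) are returned unchanged.
    """
    def swap(match):
        teacher_idx = layer_map[int(match.group(2))]
        return f"{match.group(1)}{teacher_idx}{match.group(3)}"

    return re.sub(pattern, swap, student_name, count=1)


# Last-N example from the comment above: last 6 layers of a 12-layer model.
last_map = select_layers(12, 6, "last")
print(last_map)  # [6, 7, 8, 9, 10, 11]
print(map_weight_name("encoder.layer.0.attention.self.query.weight", last_map))
```

To initialize a student from a teacher, you would loop over the student's weight names, look up `map_weight_name(name, layer_map)` in the teacher's weights, and copy the tensor across. The regular expression is the model-specific part: each architecture names its layers differently, so the pattern usually needs changing per model.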
Thank you for your excellent work. I've been reviewing your GitHub repo, and I'm interested in the implementation details of the Last-N, Mid-N, and Uniform layer selection methods mentioned in the paper. I couldn't locate them in the repository. Could you point me to where they are implemented or, if they're not included, would it be possible to add them?