gstoica27 / ZipIt

A framework for merging models solving different tasks with different initializations into one multi-task model without any additional training

[Question] ConvNeXt layers in MergeHandler #10

Closed · niklasnolte closed 1 year ago

niklasnolte commented 1 year ago

I am trying to implement a graph for ConvNeXt, which has a custom module, LayerNorm2d, where the input is permuted before being sent through LayerNorm. The MergeHandler can use handle_layernorm for that, right? No need to implement another function, if I understand it correctly. That ties into a second question: does a torch.nn.Permute need anything other than a "handle_fn" handler?
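
For context, here is a minimal sketch of the kind of ConvNeXt-style LayerNorm2d I mean (the exact module in the model may differ slightly):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LayerNorm2d(nn.LayerNorm):
    """LayerNorm over the channel dimension of an NCHW tensor (ConvNeXt-style).

    The input is permuted to NHWC, normalized with the standard layer_norm,
    and then permuted back to NCHW.
    """
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = x.permute(0, 2, 3, 1)  # NCHW -> NHWC
        x = F.layer_norm(x, self.normalized_shape, self.weight, self.bias, self.eps)
        return x.permute(0, 3, 1, 2)  # NHWC -> NCHW
```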

gstoica27 commented 1 year ago

Hi,

  1. Yes, the LayerNorm2d handler should already contain everything you need.
  2. For Permute, I believe it should look very similar to the handle_fn logic, except that you will also have to permute the dimensions of the merge/unmerge according to the permutation defined in the module (rough sketch below).
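
Purely as an illustration of that idea (the name handle_permute, the `module.dims` attribute, and the assumption that the merge/unmerge are tensors whose axes line up with the activation axes are all hypothetical here, not the actual MergeHandler API):

```python
from typing import Optional
import torch

def handle_permute(module, merge: Optional[torch.Tensor], unmerge: Optional[torch.Tensor]):
    """Hypothetical sketch: pass merge/unmerge through a Permute node,
    reordering their axes to match the module's permutation.

    Assumes `module.dims` holds the permutation (e.g. (0, 2, 3, 1)) and
    that the merge/unmerge axes mirror the activation axes.
    """
    perm = module.dims
    merge = merge.permute(*perm) if merge is not None else merge
    unmerge = unmerge.permute(*perm) if unmerge is not None else unmerge
    return merge, unmerge
```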
niklasnolte commented 1 year ago
  1. You mean the LayerNorm handler, right? There is no LayerNorm2d handler.
  2. Okay, I shall try that.

Thanks

gstoica27 commented 1 year ago

For 1 - yes, sorry, I meant the LayerNorm handler. And 2 sounds good. Also, I just replied to your email about meeting :) Please let me know if I can help with anything else!