gstoica27 / ZipIt

A framework for merging models solving different tasks with different initializations into one multi-task model without any additional training
MIT License
286 stars 25 forks

Does the order of merge/unmerge application to nodes matter in scenarios? #24

Closed metric-space closed 7 months ago

metric-space commented 8 months ago

Hey there

I was looking through the code and the paper and I'm left wondering:

Setting aside partial zips: can merge/unmerge be applied to the subgraphs (each established by propagation from the activation pseudonode) in parallel and then folded back into the graph, rather than sequentially, which is how I understand the code currently does it?

i.e.,


-- takes a starting node and its graph, and carves out a subgraph
-- the subgraph also annotates its nodes, terminal or otherwise, with merge/unmerge operations
carve_out_subgraph :: Node -> Graph -> SubGraph
carve_out_subgraph = ...

-- just a normal list of activation nodes (PREFIX/POSTFIX)
list_of_activation_nodes  :: [Node]
list_of_activation_nodes =  ...

-- each subgraph is carved out independently, so this map could run in parallel
list_of_subgraphs :: [SubGraph]
list_of_subgraphs = map (\x -> carve_out_subgraph x graph) list_of_activation_nodes

-- overlays a subgraph back onto the graph, applying its merge/unmerge ops to the weights
overlay :: SubGraph -> Graph -> Graph
overlay subgraph graph = ...

-- (A) folds the subgraphs into the original graph, which serves as the initial accumulator
resultA :: Graph
resultA = foldr overlay graph list_of_subgraphs

-- (B) same fold as (A), but over a shuffled list of subgraphs
resultB :: Graph
resultB = foldr overlay graph (shuffle list_of_subgraphs)

-- are (A) and (B) equal, i.e. does resultA == resultB hold?
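
To make the question concrete, here is a toy sketch of the two folds above. It is not ZipIt's actual code: the `Map`-based `Graph` and `SubGraph` types, the numeric weight transforms, and the names `applyAll`, `orderIndependent`, `exampleGraph`, `disjointSubs` and `overlappingSubs` are all my own stand-ins. It only illustrates the general point that folding is order-independent when the overlays commute, for instance when the subgraphs touch disjoint nodes, and may not be when they share a node.

```haskell
import qualified Data.Map.Strict as Map
import Data.List (permutations)

-- Hypothetical stand-ins: a graph maps node names to a single weight,
-- and a subgraph maps the nodes it touches to a weight transform
-- (playing the role of a merge/unmerge op).
type Node = String
type Graph = Map.Map Node Double
type SubGraph = Map.Map Node (Double -> Double)

-- Apply each transform in the subgraph to the matching node's weight.
overlay :: SubGraph -> Graph -> Graph
overlay sub g = Map.foldrWithKey (\n f acc -> Map.adjust f n acc) g sub

-- Fold a list of subgraphs into the graph, as in (A) and (B).
applyAll :: [SubGraph] -> Graph -> Graph
applyAll subs g = foldr overlay g subs

-- True when every ordering of the subgraphs yields the same graph.
orderIndependent :: [SubGraph] -> Graph -> Bool
orderIndependent subs g =
  all (== applyAll subs g) [applyAll p g | p <- permutations subs]

exampleGraph :: Graph
exampleGraph = Map.fromList [("conv1", 1), ("conv2", 2), ("fc", 3)]

-- Subgraphs touching disjoint nodes: the overlays commute.
disjointSubs :: [SubGraph]
disjointSubs =
  [ Map.fromList [("conv1", (* 2))]
  , Map.fromList [("conv2", (+ 1))]
  ]

-- Subgraphs sharing a node with non-commuting transforms: order matters.
overlappingSubs :: [SubGraph]
overlappingSubs =
  [ Map.fromList [("fc", (* 2))]
  , Map.fromList [("fc", (+ 1))]
  ]
```

In GHCi, `orderIndependent disjointSubs exampleGraph` evaluates to `True` and `orderIndependent overlappingSubs exampleGraph` to `False`. So my question boils down to whether the subgraphs carved out from different activation nodes can ever write to the same weights.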