gstoica27 / ZipIt

A framework for merging models solving different tasks with different initializations into one multi-task model without any additional training
MIT License
277 stars 24 forks

Simple MLP graph #14

Open AvivSham opened 1 year ago

AvivSham commented 1 year ago

Hi @gstoica27, How are you? Can you please provide a graph example for stacked linear layers (i.e. MLP) with activation functions? Graph sketch: Linear->ReLU->Linear->ReLU->Linear->ReLU (3 linear layers as an example).

Cheers,

AvivSham commented 1 year ago

@dbolya Maybe you can help with this, please?

gstoica27 commented 1 year ago

Hey,

The graph would look something like this:

(Node 0, INPUT) -> (Node 1, Linear) -> (Node 2, ReLU) -> (Node 3, POSTFIX) -> (Node 4, Linear) -> (Node 5, ReLU) -> (Node 6, POSTFIX) -> (Node 7, Linear) -> (Node 8, ReLU) -> OUTPUT

There may also optionally be a POSTFIX after node 8, depending on whether you place a classification/head layer after the last ReLU activation.

However, to get a better sense of the graph structure, I would strongly encourage you to run the already-implemented graphs in the "graphs" directory for the ResNet, SinGAN, and VGG models (VGG is the most like the network you're describing). You can run the files by following the instructions in the README and easily generate graph visualizations to inspect.
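For readers who want to see the node chain above spelled out for an arbitrary depth, here is a minimal sketch in plain Python. It does not use ZipIt at all: the function name `mlp_node_chain`, the `trailing_postfix` flag, and the string labels are hypothetical and only mirror the node kinds named in this thread. It numbers nodes consecutively and gives OUTPUT an index too, purely for convenience.

```python
from typing import List, Tuple


def mlp_node_chain(num_linear: int, trailing_postfix: bool = False) -> List[Tuple[int, str]]:
    """Return (node_index, node_kind) pairs for a stack of Linear -> ReLU blocks.

    Node kinds are plain strings for illustration, not ZipIt's NodeType enum.
    """
    chain = [(0, "INPUT")]
    for block in range(num_linear):
        # Each block contributes a Linear node followed by a ReLU node.
        for kind in ("Linear", "ReLU"):
            chain.append((len(chain), kind))
        # A POSTFIX node sits between blocks; after the last block it is optional.
        is_last = block == num_linear - 1
        if not is_last or trailing_postfix:
            chain.append((len(chain), "POSTFIX"))
    chain.append((len(chain), "OUTPUT"))
    return chain


if __name__ == "__main__":
    print(" -> ".join(f"(Node {i}, {kind})" for i, kind in mlp_node_chain(3)))
```

Passing `trailing_postfix=True` corresponds to the optional POSTFIX after the last ReLU when a classification head follows.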

Thanks!

gstoica27 commented 1 year ago

Also, please see the bottom of https://github.com/gstoica27/ZipIt/issues/6 for a visualization of the ResNet graph produced by running the graphs.resnet_graph.py file :) - Hope this helps!

AvivSham commented 1 year ago

Hi @gstoica27, Thank you for responding! I think you misunderstood me. I didn't mean a graphical graph; I'm looking for a code snippet that constructs a BIGGraph suitable for an MLP architecture.

I really appreciate any help you can provide.

gstoica27 commented 1 year ago

Hi Aviv,

If you comment out line 27, you can actually already use the VGG graph code to construct an MLP. To construct an MLP as you've described, you can just set the architecture parameter to [Linear1Dim, Linear2Dim, Linear3Dim] (assuming your MLP is an nn.Sequential assigned to a variable called "self.features").

To be clear though, all BIGGraph does is encode the computational flow of an actual model architecture you would place under "models/*". models/ and graphs/ go in tandem with one another, as described in the README. So to do what you're asking you will have to create both a graph and a model architecture.

Here is an example (pseudocode):

# Model
import torch.nn as nn


class MLP(nn.Module):
    def __init__(self):
        super().__init__()
        # Three Linear -> ReLU blocks; nn.Sequential names them features.0 ... features.5
        self.features = nn.Sequential(
            nn.Linear(10, 32),
            nn.ReLU(),
            nn.Linear(32, 32),
            nn.ReLU(),
            nn.Linear(32, 32),
            nn.ReLU(),
        )
        self.classifier = nn.Linear(32, 10)  # 10 output classes

    def forward(self, x):
        return self.classifier(self.features(x))

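# Graph
# Import path assumed from the repo's graphs/ package; adjust if it differs.
from graphs.base_graph import BIGGraph, NodeType


class MLPGraph(BIGGraph):

    def __init__(self, model):
        super().__init__(model)

    def graphify(self):
        input_node = self.create_node(node_type=NodeType.INPUT)
        node_insert = NodeType.PREFIX
        # Layer names must match the module names of the model: features.0 ... features.5, then classifier
        graph = [f'features.{graph_idx}' for graph_idx in range(6)] + ['classifier', NodeType.OUTPUT]
        self.add_nodes_from_sequence('', graph, input_node, sep='')
        return self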

Please see the repo, graphs and models READMEs for more information :).
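The layer-name list in the graph above relies on how nn.Sequential names its children: each child is registered under its index, so the dotted names under a container called "features" are features.0, features.1, and so on. A minimal sketch of building that list for a different depth follows, in plain Python. The helper `sequential_layer_names` and its parameters are hypothetical and are not part of ZipIt. The NodeType.OUTPUT entry from the thread's example is left out here so the sketch runs without the repo.

```python
from typing import List


def sequential_layer_names(num_layers: int,
                           container: str = "features",
                           head: str = "classifier") -> List[str]:
    """Build dotted module names for an nn.Sequential container plus a head layer.

    num_layers counts every module in the container, activations included,
    so three Linear -> ReLU blocks give num_layers = 6.
    """
    names = [f"{container}.{idx}" for idx in range(num_layers)]
    names.append(head)
    return names


if __name__ == "__main__":
    # Three Linear -> ReLU blocks, as in the MLP from this thread.
    print(sequential_layer_names(6))
```

If the model changes (say, four Linear layers with ReLU after each), the count passed here would need to change with it, since the graph and the model have to agree on module names.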

AvivSham commented 1 year ago

Now it is clear! @gstoica27 Thank you very much! I think you missed something in the code above: node_insert is unused. Shouldn't it be attached to every features.{graph_idx}? One final question: suppose I have two trained networks, both with the architecture from the code snippet above. Which script or function do I need to run to align them?

Thanks in advance.