Neural Networks as Sub-Components of a Neural Network ( in other words: Mixed Layers )

StephenAshmore commented 7 years ago

With issue #31 being worked on right now, we have the question of how to implement mixed layers with our new setup. @thelukester92 and I were discussing this, and here is his solution, which I support.

We could implement neural networks inside of a single neural network, essentially thinking of them as blocks or components. Here is a rough drawing of how the components would look in a neural network.

[Image: neural networks mixed layer]

Component 1 is just several fully connected layers. Component 2 is a shortcut connection similar to #19. The inputs are fully connected to the inputs of both components, and the outputs of the two components are fully connected to the next layer after them. This will enable us to do things like LSTM gates, residual blocks, or shortcut connections relatively easily. We can make them as complex or as simple as we'd like, because we only need to know whether to feed forward through layers or feed forward through a network.

mikegashler commented 7 years ago

I have been thinking about how to implement this. Here is a candidate implementation plan:

Rename GNeuralNetLayer to GUnits.
Rename all of the GLayerXXX classes to GUnitsXXX.
Add a new GLayer class that will concatenate any number of GUnits into a single layer.
Modify the GNeuralNet constructor to require the user to specify the number of layers. Layers are initially empty. The user will concatenate units to the layers. So, the interface might look something like this:

    GNeuralNet nn(4); // Make a 4-layer neural net
    nn.concat(0, new GUnitsLinear(100));
    nn.concat(0, new GUnitsConvolutional2d(32, 32, 8));
    nn.concat(1, new GUnitsTanh());
    nn.concat(2, new GUnitsLinear(15));
    nn.concat(3, new GUnitsTanh());
    nn.train(features, labels); // throws if any layers are still empty

Move the "blame" and "activation" vectors into the GLayer class. Units will reference their assigned portions of these vectors.
Rename the "resize" method in the units classes to "init", and modify it to hook up the "blame" and "activation" vector references.
When we call nn.beginIncrementalLearning, that will run through all the layers and init all the units.
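To make the shared-vector idea concrete, here is a minimal sketch of a layer that owns the "activation" and "blame" vectors and hands each block of units a pointer to its slice during init. The class names (Units, IdentityUnits, TanhUnits, Layer) and their members are hypothetical stand-ins for the proposed GUnits and GLayer, not the actual Waffles classes, and the blocks here have no weights.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for the proposed GUnits; not the real Waffles class.
class Units {
public:
    explicit Units(std::size_t n) : m_count(n) {}
    virtual ~Units() = default;
    std::size_t count() const { return m_count; }
    // Point this block at its slice of the owning layer's vectors.
    void init(double* activation, double* blame) { m_act = activation; m_blame = blame; }
    // Write one activation per unit into the assigned slice.
    virtual void feedForward(const std::vector<double>& in) = 0;
protected:
    std::size_t m_count;
    double* m_act = nullptr;
    double* m_blame = nullptr;
};

// Copies its inputs through unchanged (a shortcut-style block).
class IdentityUnits : public Units {
public:
    using Units::Units;
    void feedForward(const std::vector<double>& in) override {
        assert(in.size() >= m_count);
        for (std::size_t i = 0; i < m_count; ++i) m_act[i] = in[i];
    }
};

// Applies tanh element-wise to its inputs.
class TanhUnits : public Units {
public:
    using Units::Units;
    void feedForward(const std::vector<double>& in) override {
        assert(in.size() >= m_count);
        for (std::size_t i = 0; i < m_count; ++i) m_act[i] = std::tanh(in[i]);
    }
};

// Hypothetical stand-in for the proposed GLayer: owns the vectors and
// concatenates any number of unit blocks side by side.
class Layer {
public:
    void concat(std::unique_ptr<Units> u) { m_units.push_back(std::move(u)); }
    // Size the shared vectors, then give each block its slice.
    void init() {
        std::size_t total = 0;
        for (auto& u : m_units) total += u->count();
        m_activation.assign(total, 0.0);
        m_blame.assign(total, 0.0);
        std::size_t offset = 0;
        for (auto& u : m_units) {
            u->init(m_activation.data() + offset, m_blame.data() + offset);
            offset += u->count();
        }
    }
    // Every block sees the same input and fills in its own slice.
    const std::vector<double>& feedForward(const std::vector<double>& in) {
        for (auto& u : m_units) u->feedForward(in);
        return m_activation;
    }
private:
    std::vector<std::unique_ptr<Units>> m_units;
    std::vector<double> m_activation, m_blame;
};
```

Concatenating an IdentityUnits(2) and a TanhUnits(2) into one Layer and calling init() yields a single 4-element activation vector, with the first two entries written by the identity block and the last two by the tanh block. Sizing the vectors only in init() is why the blocks cannot be hooked up at construction time in this sketch.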

Finally, to complete the feature...

Add a new "GComponent" class that inherits from "GUnits" and contains a collection of layers. Since GComponent is itself a GUnit, users can concatenate entire components to layers, as illustrated in Stephen's diagram above.
At this point, a GNeuralNet and a GComponent are pretty much the same thing, so we might as well see if we can fuse them together somehow. Perhaps each GNeuralNet should just contain a single GComponent.
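The nesting described here is essentially the composite pattern. Below is a rough sketch of it, assuming a simplified interface in which every block maps an input vector to an output vector. Block, ScaleBlock, TanhBlock, BlockLayer, and Component are hypothetical names chosen for illustration; they are not the Waffles classes, and ScaleBlock is only a trivial placeholder for a weighted block.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

using Vec = std::vector<double>;

// Hypothetical interface standing in for the proposed GUnits.
struct Block {
    virtual ~Block() = default;
    virtual Vec forward(const Vec& in) = 0;
};

// Multiplies every input by a fixed factor (placeholder for a weighted block).
struct ScaleBlock : Block {
    double factor;
    explicit ScaleBlock(double f) : factor(f) {}
    Vec forward(const Vec& in) override {
        Vec out(in.size());
        for (std::size_t i = 0; i < in.size(); ++i) out[i] = factor * in[i];
        return out;
    }
};

// Applies tanh element-wise.
struct TanhBlock : Block {
    Vec forward(const Vec& in) override {
        Vec out(in.size());
        for (std::size_t i = 0; i < in.size(); ++i) out[i] = std::tanh(in[i]);
        return out;
    }
};

// A layer feeds the same input to each of its blocks and concatenates
// their outputs. An empty layer is treated as a usage error.
struct BlockLayer {
    std::vector<std::unique_ptr<Block>> blocks;
    Vec forward(const Vec& in) {
        assert(!blocks.empty());
        Vec out;
        for (auto& b : blocks) {
            Vec part = b->forward(in);
            out.insert(out.end(), part.begin(), part.end());
        }
        return out;
    }
};

// Stand-in for the proposed GComponent: a Block that contains a stack of
// layers, so a whole sub-network can sit inside a layer next to plain blocks.
struct Component : Block {
    std::vector<BlockLayer> layers;
    Vec forward(const Vec& in) override {
        Vec cur = in;
        for (auto& layer : layers) cur = layer.forward(cur);
        return cur;
    }
};
```

Because Component derives from Block, a BlockLayer can hold a two-layer Component beside a ScaleBlock(1.0) acting as a shortcut, which mirrors the two components in Stephen's drawing. A top-level network could then be a thin wrapper around one Component, in the spirit of the fusion suggested above.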

mikegashler commented 7 years ago

Okay, I did it. This turned out to require a lot more code restructuring than I initially anticipated, so everything that uses neural networks in Waffles will now need to be repaired to accommodate the new interface. The new interface is more cumbersome than before, but it is a lot more flexible. See waffles/web/doc/neuralnet.html for details about how to use it.

mikegashler / waffles

Neural Networks as Sub-Components of a Neural Network ( in other words: Mixed Layers ) #32