mikegashler / waffles

A toolkit of machine learning algorithms.
http://gashler.com/mike/waffles/

Neural Networks as Sub-Components of a Neural Network (in other words: Mixed Layers) #32

Closed StephenAshmore closed 7 years ago

StephenAshmore commented 7 years ago

With issue #31 being worked on right now, we have the question of how to implement mixed layers with our new setup. @thelukester92 and I were discussing this, and here is his solution, which I support.

We could implement neural networks inside of a single neural network, essentially thinking of them as blocks or components. Here is a rough drawing of how the components would look in a neural network.

[Figure: neural networks mixed layer]

Component 1 is just several fully connected layers. Component 2 is a shortcut connection similar to #19. The inputs are fully connected to the inputs of both components, and the outputs of the two components are fully connected to the layer that follows them. This will enable us to do things like LSTM gates, residual blocks, or shortcut connections relatively easily. We can make the components as complex or as simple as we'd like, because we only need to know whether to feed forward through layers or feed forward through a network.
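To make the "network as a component" idea concrete, here is a minimal C++ sketch. It is not Waffles code: `Component`, `Scale`, `Identity`, `Sequence`, `Mixed`, and `demoMixed` are hypothetical names invented for illustration. It also simplifies the wiring, passing the same input vector to each component and concatenating their outputs rather than using learned fully connected weights.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical interface (not the Waffles API): anything that maps an input
// vector to an output vector can be used as a component.
struct Component {
    virtual ~Component() = default;
    virtual std::vector<double> forward(const std::vector<double>& in) const = 0;
};

// Toy leaf component standing in for a real layer: scales every input.
struct Scale : Component {
    double factor;
    explicit Scale(double f) : factor(f) {}
    std::vector<double> forward(const std::vector<double>& in) const override {
        std::vector<double> out(in);
        for (double& v : out) v *= factor;
        return out;
    }
};

// Shortcut connection: passes its input through unchanged.
struct Identity : Component {
    std::vector<double> forward(const std::vector<double>& in) const override {
        return in;
    }
};

// A small network is itself a Component: it applies its steps in order.
struct Sequence : Component {
    std::vector<std::unique_ptr<Component>> steps;
    std::vector<double> forward(const std::vector<double>& in) const override {
        std::vector<double> x(in);
        for (const auto& step : steps) x = step->forward(x);
        return x;
    }
};

// Mixed layer: every part sees the same input; outputs are concatenated.
struct Mixed : Component {
    std::vector<std::unique_ptr<Component>> parts;
    std::vector<double> forward(const std::vector<double>& in) const override {
        std::vector<double> out;
        for (const auto& part : parts) {
            std::vector<double> y = part->forward(in);
            out.insert(out.end(), y.begin(), y.end());
        }
        return out;
    }
};

// Builds "Component 1" (a two-step sub-network) next to "Component 2"
// (a shortcut) and feeds one input vector through both.
std::vector<double> demoMixed(const std::vector<double>& in) {
    auto component1 = std::make_unique<Sequence>();
    component1->steps.push_back(std::make_unique<Scale>(2.0));
    component1->steps.push_back(std::make_unique<Scale>(3.0));

    Mixed layer;
    layer.parts.push_back(std::move(component1));
    layer.parts.push_back(std::make_unique<Identity>());
    return layer.forward(in);
}
```

Because `Sequence` and `Mixed` both implement `Component`, either one can be nested inside the other to any depth, which is the property the proposal relies on for residual blocks and gates.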

mikegashler commented 7 years ago

I have been thinking about how to implement this. Here is a candidate implementation plan:

    GNeuralNet nn(4); // Make a 4-layer neural net
    nn.concat(0, new GUnitsLinear(100));
    nn.concat(0, new GUnitsConvolutional2d(32, 32, 8));
    nn.concat(1, new GUnitsTanh());
    nn.concat(2, new GLinear(15));
    nn.concat(3, new GUnitsTanh());
    nn.train(features, labels); // throws if any layers are still empty
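The bookkeeping behind a `concat(layer, units)` call like the one in this plan might look roughly like the sketch below. This is only an illustration under assumptions, not the Waffles implementation: `LayeredNet`, `Units`, `layerWidth`, and `validate` are hypothetical names, `std::unique_ptr` replaces the raw `new` from the plan, and `validate` stands in for the check that `train` is described as performing.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a block of units; only its output count is modeled.
struct Units {
    std::size_t outputs;
    explicit Units(std::size_t n) : outputs(n) {}
};

// Hypothetical container: each layer holds zero or more blocks side by side.
class LayeredNet {
public:
    explicit LayeredNet(std::size_t layerCount) : layers_(layerCount) {}

    // Appends a block to the given layer, next to any blocks already there.
    void concat(std::size_t layer, std::unique_ptr<Units> units) {
        if (layer >= layers_.size())
            throw std::out_of_range("layer index out of range");
        layers_[layer].push_back(std::move(units));
    }

    // A layer's width is the sum of the outputs of its blocks.
    std::size_t layerWidth(std::size_t layer) const {
        std::size_t total = 0;
        for (const auto& u : layers_.at(layer)) total += u->outputs;
        return total;
    }

    // Mirrors the "throws if any layers are still empty" comment in the plan.
    void validate() const {
        for (std::size_t i = 0; i < layers_.size(); ++i) {
            if (layers_[i].empty())
                throw std::runtime_error("layer " + std::to_string(i) + " is empty");
        }
    }

private:
    std::vector<std::vector<std::unique_ptr<Units>>> layers_;
};
```

Fixing the number of layers up front and filling each slot with `concat` lets a mixed layer be expressed as two or more `concat` calls on the same index, at the cost of a runtime check for slots that were never filled.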

Finally, to complete the feature...

mikegashler commented 7 years ago

Okay, I did it. This turned out to require a lot more code restructuring than I initially anticipated, so everything that uses neural networks in Waffles will now need to be repaired to accommodate the new interface. The new interface is more cumbersome than before, but it is a lot more flexible. See waffles/web/doc/neuralnet.html for details about how to use it.