SMART-Lab / smartmodels

Repository containing the different models we developed in the SMART-Lab.

Build a ModelNetwork by putting together Layer objects #10

Closed: MarcCote closed this issue 9 years ago

MarcCote commented 9 years ago

I still think we should have Layer objects that we could manipulate to form the model network.

This is what I have in mind for a simple feed forward neural network with two hidden layers.

input_layer = Layer(size=trainset.input_size)
first_hidden_layer = FullyConnectedLayer(size=100, activation_fct=SigmoidActivation)
second_hidden_layer = FullyConnectedLayer(size=50, activation_fct=TanhActivation)
output_layer = FullyConnectedLayer(size=trainset.target_size, activation_fct=SigmoidActivation)

network = input_layer + first_hidden_layer + second_hidden_layer + output_layer
model = NN(network, ...)
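To make the idea concrete, here is a rough sketch of how the + composition could work; none of this exists yet, and the Network class and parameters property below are only placeholders:

# Hypothetical sketch (not actual smartmodels code): layers overload `+` to
# build a Network object that keeps an ordered list of layers.
class Layer(object):
    def __init__(self, size):
        self.size = size
        self.parameters = []  # filled in by concrete subclasses

    def __add__(self, other):
        return Network([self]) + other

class Network(object):
    def __init__(self, layers):
        self.layers = list(layers)

    def __add__(self, other):
        # Chaining with another Network concatenates the two lists of layers;
        # chaining with a Layer appends it.
        if isinstance(other, Network):
            return Network(self.layers + other.layers)
        return Network(self.layers + [other])

    @property
    def parameters(self):
        # Flat list of the parameters of every layer.
        return [p for layer in self.layers for p in layer.parameters]

With something like this, input_layer + first_hidden_layer + second_hidden_layer + output_layer evaluates left to right and yields a single Network holding the four layers in order.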

This is an example of an RNN with two inputs.

input_layer = Layer(size=trainset.input_size)
first_hidden_layer = RecurrentLayer(size=100, activation_fct=SigmoidActivation)
second_hidden_layer = RecurrentLayer(size=50, activation_fct=TanhActivation)
another_hidden_layer = RecurrentLayer(size=50, activation_fct=ReLuActivation)
output_layer = FullyConnectedLayer(size=trainset.target_size, activation_fct=SigmoidActivation)

network1 = input_layer + first_hidden_layer + second_hidden_layer
network2 = input_layer + another_hidden_layer
merged_networks = AggregateNetworks(network1, network2, method="sum")
network = merged_networks + output_layer
model = RNN(network, ...)
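The merge itself could be something as simple as the following (again only a sketch; I am assuming every branch exposes an fprop method and that the branch outputs have compatible shapes):

import numpy as np

# Hypothetical sketch of the merge: each sub-network produces an output of the
# same shape and AggregateNetworks combines them before feeding the next layer.
class AggregateNetworks(object):
    def __init__(self, *networks, method="sum"):
        self.networks = networks
        self.method = method

    @property
    def parameters(self):
        return [p for net in self.networks for p in net.parameters]

    def fprop(self, x):
        outputs = [net.fprop(x) for net in self.networks]
        if self.method == "sum":
            return np.sum(outputs, axis=0)
        raise ValueError("Unsupported merge method: %s" % self.method)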

network.parameters should provide a list of the parameters of every layer.

More thinking is needed to support dropout and batch normalization; maybe some sort of decorator design pattern would be useful for that (see the sketch below).
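For example, a dropout wrapper could expose the same interface as the layer it decorates (hypothetical names, just to illustrate the pattern):

import numpy as np

# Hypothetical sketch of the decorator pattern: the wrapper exposes the same
# interface as the layer it wraps and only modifies the output of fprop.
class DropoutDecorator(object):
    def __init__(self, layer, drop_prob=0.5, rng=None):
        self.layer = layer
        self.drop_prob = drop_prob
        self.rng = rng if rng is not None else np.random.RandomState(1234)

    @property
    def parameters(self):
        # Dropout adds no parameters of its own.
        return self.layer.parameters

    def fprop(self, x, training=True):
        out = self.layer.fprop(x)
        if not training:
            return out
        # Inverted dropout: zero out units and rescale so the expected
        # activation is unchanged at test time.
        mask = self.rng.binomial(n=1, p=1.0 - self.drop_prob, size=out.shape)
        return out * mask / (1.0 - self.drop_prob)

Batch normalization could be handled the same way, with the wrapper owning the extra scale and shift parameters.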

A layer is responsible for asking for the outputs of the previous layer(s) it is connected to, weighting them, and applying its activation function (i.e. the fprop of this layer).

More thinking is needed for models that have multiple outputs. @havaeimo I would love to have your input on this.

ASalvail commented 9 years ago

Alright, for the fun of it:

import theano.tensor as T
from theano import shared

# abstract base class
class Layer(object):
    def __init__(self, size: int, name: str, parents: dict, bias_initer, activation_function):
        # `parents` maps each parent Layer to the WeightInitializer used for
        # the weights of the edge connecting that parent to this layer.
        ...

class FullyConnectedLayer(Layer):
    def __init__(self, ...):
        super().__init__(...)
        # One weight matrix (a Theano shared variable) per incoming edge.
        self.edges_parameters = dict()
        for layer, initer in self.parents.items():
            self.edges_parameters[layer] = shared(initer((layer.size, self.size)))
        # The bias is created once, outside the loop (assuming the base class
        # stored bias_initer).
        self.bias = shared(self.bias_initer(self.size))
        self._graph = None

    @property
    def graph(self):
        # Lazily build the symbolic expression: the weighted sum of the
        # parents' outputs plus the bias, passed through the activation function.
        if self._graph is None:
            activation = []
            for parent, weights in self.edges_parameters.items():
                activation.append(T.dot(parent.graph, weights))
            activation.append(self.bias)
            self._graph = self.activation_function(sum(activation))
        return self._graph

I'm not familiar enough with scan to propose something sound, yet.
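For what it's worth, I imagine the recurrent step would end up being something like this (untested sketch; the names and shapes are placeholders):

import numpy as np
import theano
import theano.tensor as T

input_size, hidden_size = 20, 100

W_in = theano.shared(0.01 * np.random.randn(input_size, hidden_size).astype(theano.config.floatX))
W_rec = theano.shared(0.01 * np.random.randn(hidden_size, hidden_size).astype(theano.config.floatX))
b = theano.shared(np.zeros(hidden_size, dtype=theano.config.floatX))

x = T.tensor3('x')  # (time, batch, input_size)
h0 = T.zeros((x.shape[1], hidden_size))

def step(x_t, h_tm1):
    # One step of the recurrence: weighted input + recurrent term + bias.
    return T.nnet.sigmoid(T.dot(x_t, W_in) + T.dot(h_tm1, W_rec) + b)

# scan unrolls `step` over the time dimension of x.
h, _ = theano.scan(step, sequences=x, outputs_info=h0)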

ASalvail commented 9 years ago

Changed with the introduction of blocks. See PR #15.