IDSIA / brainstorm

Fast, flexible and fun neural networks.

Regression Discrepancy #65

Closed jramapuram closed 8 years ago

jramapuram commented 8 years ago

It looks like regression is looking for input names inputs_1 and inputs_2 instead of targets and default.

Also, I can't seem to get this to work even with the name changes. I get the following error:

brainstorm.utils.LayerValidationError: Invalid in_shapes. SquaredDifference has no input(s) named "set([u'default', u'targets'])". Choices are: set([u'inputs_2', u'inputs_1'])

Sizing is: Input shape: (1411889, 1, 1) | Target shape: (1411889, 1, 1)

import numpy as np
import brainstorm as bs
from brainstorm.data_iterators import Minibatches
from brainstorm.handlers import PyCudaHandler

# x_tr is a 1-D array of training values loaded earlier in the script
batch_size = 1
x_tr = x_tr[:, np.newaxis]
y_tr = np.roll(x_tr, -1, axis=0)  # target is the next time step

print("Input shape: %s | Target shape: %s" % (x_tr.shape, y_tr.shape))
getter_tr = Minibatches(batch_size, default=x_tr, targets=y_tr, shuffle=False)

# ----------------------------- Set up Network ------------------------------ #

# Note: in_shape/out_shape are feature sizes, not the batch size; here they
# happen to coincide since batch_size == 1.
network = bs.tools.create_net_from_spec(task_type='regression',
                                        in_shape=batch_size,
                                        out_shape=batch_size,
                                        spec='L1 L512 D0.5 L256 D0.5 L512 L1')  # also tried F1 for the last layer
#                                        data_name='inputs_1',
#                                        targets_name='inputs_2')
network.set_handler(PyCudaHandler())
network.initialize(bs.initializers.Gaussian(0.01))

# ----------------------------- Set up Trainer ------------------------------ #

trainer = bs.Trainer(bs.training.MomentumStepper(learning_rate=0.01,
                                                 momentum=0.9),
                     verbose=True)
trainer.add_hook(bs.hooks.ProgressBar())
# out_name='Output_Conv.outputs.default')]
scorers = [bs.scorers.MeanSquaredError()]  # targets_name='inputs_2'
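
For reference, a rough sketch of how these pieces would then be handed to the trainer once the names line up; the validation split, the MonitorScores hook, and the epoch limit below are illustrative additions, not part of the original script:

# Illustrative sketch: hold out the tail of the series for validation,
# monitor the MSE scorers on it, and run training.
x_va, y_va = x_tr[-1000:], y_tr[-1000:]
getter_va = Minibatches(batch_size, default=x_va, targets=y_va, shuffle=False)

trainer.add_hook(bs.hooks.MonitorScores('valid_getter', scorers, name='validation'))
trainer.add_hook(bs.hooks.StopAfterEpoch(10))
trainer.train(network, getter_tr, valid_getter=getter_va)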
flukeskywalker commented 8 years ago

You're right, it was an issue with the input view names.

Since SquaredDifference is a general-purpose layer (it computes gradients w.r.t. both of its inputs), it does not use the default/targets names used by layers like SoftmaxCE. We plan to address this soon by introducing a similar layer just for regression, which will be more intuitive to use and won't compute the extra gradient.

For now I've fixed the tool, so it should work.

jramapuram commented 8 years ago

Great, thanks for the quick turnaround. Looks like it is working now.

adampl commented 8 years ago

@jramapuram Could you post a simple regression example where the network is used to predict something?
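
Not from the original thread, but a rough sketch of how predictions could be pulled out of a trained net built with create_net_from_spec; the output layer name 'Output' and the dummy targets passed for the forward pass are assumptions about what the tool builds:

# Rough sketch: run a forward pass on new data and read the output buffer.
# 'Output' as the layer name and the dummy targets are assumptions.
x_new = x_tr[:100]                      # shaped (time, batch, features)
dummy_targets = np.zeros_like(x_new)

network.provide_external_data({'default': x_new, 'targets': dummy_targets})
network.forward_pass()
prediction = network.get('Output.outputs.default')
print(prediction.shape)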

sjoerdvansteenkiste commented 8 years ago

@flukeskywalker The pip version still suffers from this bug, btw. Since it is pretty detrimental, perhaps the fix should be forwarded there?

flukeskywalker commented 8 years ago

Thanks @sjoerdvansteenkiste. We will do a new release tomorrow to fix this situation.