Layer contract violations as class attributes

The layers in brainstorm fulfill a contract, by (non-exhaustive list):

calculating their input_deltas for all given output_deltas
calculating the gradients for all parameters
always adding to the input_deltas and to the gradients (never overwriting them)
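The last point is the easiest to get wrong, so here is a minimal sketch of why it matters. It uses plain Python lists instead of brainstorm buffers, and `backward_add` and `backward_overwrite` are hypothetical names made up for illustration, not brainstorm functions. When two consumers read the same output, both write into one delta buffer, and overwriting silently drops the first contribution:

```python
def backward_add(input_deltas, output_deltas, weight):
    """Accumulate this consumer's contribution into the shared buffer."""
    for i, delta in enumerate(output_deltas):
        input_deltas[i] += weight * delta


def backward_overwrite(input_deltas, output_deltas, weight):
    """Contract violation: replaces whatever was already in the buffer."""
    for i, delta in enumerate(output_deltas):
        input_deltas[i] = weight * delta


# Two consumers of the same output send deltas back (made-up numbers).
out_deltas_a = [1.0, 2.0]
out_deltas_b = [10.0, 20.0]

accumulated = [0.0, 0.0]
backward_add(accumulated, out_deltas_a, 1.0)
backward_add(accumulated, out_deltas_b, 0.5)
# accumulated now holds both contributions: [6.0, 12.0]

overwritten = [0.0, 0.0]
backward_overwrite(overwritten, out_deltas_a, 1.0)
backward_overwrite(overwritten, out_deltas_b, 0.5)
# overwritten only holds the last contribution: [5.0, 10.0]
```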

Except that some layers don't fulfill this contract, and they have good reasons. E.g.:

the SoftmaxCE layer and the BinomialCrossEntropy layer don't compute any deltas for their targets input
the SoftmaxCE layer doesn't use the deltas coming in from its probabilities output
the Mask layer doesn't compute any deltas for its mask input
the ClockworkRnn layer doesn't compute the gradients for its timing parameter

Right now we document that behaviour (sometimes), and in the tests we specify which things not to test, because otherwise the automated layer tests would fail for these layers.

But I think these contract violations are important, and in future versions we might want to warn the user about architectures that won't work as expected because of them. So I suggest we add the contract violations as attributes to the layer class. Something like this:

class SoftmaxCELayerImpl(Layer):
    expected_inputs = {'default': StructureTemplate('T', 'B', '...'),
                       'targets': StructureTemplate('T', 'B', '...')}
    expected_kwargs = {}

    computes_no_input_deltas_for = ['targets']
    # computes_no_gradients_for = []
    takes_no_output_deltas_for = ['probabilities']

    ...

(The names are open for debate)
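As a rough idea of how the user warning could build on such attributes, here is a sketch under heavy assumptions. The stand-in classes, the `(source, sink, sink_input)` connection tuples, and `find_blocked_connections` are all hypothetical and do not reflect how brainstorm represents architectures. It only looks at direct connections: a trainable layer wired into an input that receives no deltas gets flagged.

```python
import warnings


class SoftmaxCEStandIn:
    # Hypothetical stand-in carrying the proposed attribute.
    computes_no_input_deltas_for = ['targets']


class TrainableStandIn:
    # Hypothetical stand-in for a layer that computes all input deltas.
    computes_no_input_deltas_for = []


def find_blocked_connections(layer_classes, connections, trainable):
    """Return connections that feed a trainable layer into a delta-free input."""
    problems = []
    for source, sink, sink_input in connections:
        blocked = getattr(layer_classes[sink],
                          'computes_no_input_deltas_for', [])
        if sink_input in blocked and source in trainable:
            problems.append((source, sink, sink_input))
    return problems


layer_classes = {'hidden': TrainableStandIn,
                 'target_net': TrainableStandIn,
                 'loss': SoftmaxCEStandIn}
connections = [('hidden', 'loss', 'default'),
               ('target_net', 'loss', 'targets')]

problems = find_blocked_connections(layer_classes, connections,
                                    trainable={'hidden', 'target_net'})
for source, sink, sink_input in problems:
    warnings.warn("layer '%s' feeds input '%s' of '%s', which computes no "
                  "deltas for it" % (source, sink_input, sink))
```

A real check would also have to follow the graph further upstream, since every trainable layer behind the blocked input is cut off from the error signal, not just the direct source.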

These attributes would serve both as documentation and as a basis for automatic checking. The tests could read them to decide what to skip, and we could require that every contract violation is declared this way.
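To make the testing idea concrete, here is a minimal sketch of how a test could derive what to check from the class attributes. The `Layer` base class and `SoftmaxCEStandIn` below are simplified stand-ins, `names_to_check` and `plan_checks` are hypothetical helpers, and the `loss` output name is an assumption. None of this is existing brainstorm test code.

```python
class Layer:
    """Hypothetical stand-in base class: no violations by default."""
    computes_no_input_deltas_for = []
    computes_no_gradients_for = []
    takes_no_output_deltas_for = []


class SoftmaxCEStandIn(Layer):
    computes_no_input_deltas_for = ['targets']
    takes_no_output_deltas_for = ['probabilities']


def names_to_check(all_names, exceptions):
    """Drop declared exceptions, rejecting names the layer doesn't have."""
    unknown = set(exceptions) - set(all_names)
    if unknown:
        raise ValueError('declared exceptions not found: %s' % sorted(unknown))
    return [name for name in all_names if name not in exceptions]


def plan_checks(layer_cls, inputs, parameters, outputs):
    """Map each part of the contract to the names a test should cover."""
    return {
        'input_deltas': names_to_check(
            inputs, layer_cls.computes_no_input_deltas_for),
        'gradients': names_to_check(
            parameters, layer_cls.computes_no_gradients_for),
        'output_deltas': names_to_check(
            outputs, layer_cls.takes_no_output_deltas_for),
    }


plan = plan_checks(SoftmaxCEStandIn,
                   inputs=['default', 'targets'],
                   parameters=[],
                   outputs=['probabilities', 'loss'])  # 'loss' is assumed
```

Raising on unknown names means a typo in one of the attributes fails loudly instead of quietly disabling a check.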

What do you guys think about that?

IDSIA / brainstorm

Layer contract violations as class attributes #55