The layers in brainstorm fulfill a contract, which includes (non-exhaustive list):
- calculating their `input_deltas` for all given `output_deltas`
- calculating the `gradients` for all `parameters`
- always adding to the `input_deltas` and to the `gradients` (never overwriting them)
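To illustrate why the "add, don't overwrite" rule matters, here is a minimal sketch in plain Python. `ScaleLayer` and its `backward` signature are hypothetical and are not brainstorm's actual API; the point is only that two consumers of the same input can share one delta buffer if both accumulate into it.

```python
# Hypothetical toy layer for illustration; not brainstorm's real layer API.
class ScaleLayer:
    """Computes output = weight * input for a list of floats."""

    def __init__(self, weight):
        self.weight = weight

    def backward(self, inputs, output_deltas, input_deltas, gradients):
        for i, (x, d_out) in enumerate(zip(inputs, output_deltas)):
            # Accumulate with += so deltas written by other consumers survive.
            input_deltas[i] += self.weight * d_out
            gradients["weight"] += x * d_out


inputs = [1.0, 2.0]
input_deltas = [0.0, 0.0]  # shared buffer, zeroed once per backward pass
grads_a = {"weight": 0.0}
grads_b = {"weight": 0.0}

# Two layers read the same inputs, so both add into the same input_deltas.
ScaleLayer(2.0).backward(inputs, [1.0, 1.0], input_deltas, grads_a)
ScaleLayer(3.0).backward(inputs, [1.0, 1.0], input_deltas, grads_b)

print(input_deltas)  # [5.0, 5.0]
```

Had either layer assigned with `=` instead of `+=`, the contribution of the other one would have been lost.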
Except that some layers don't, and they have good reasons. For example:
- the `SoftmaxCE` layer and the `BinomialCrossEntropy` layer don't compute any deltas for their `targets` input
- the `SoftmaxCE` layer doesn't use the deltas coming in from its `probabilities` output
- the `Mask` layer doesn't compute any deltas for its `mask` input
- the `ClockworkRnn` layer doesn't compute the gradients for its `timing` parameter
Right now we document that behaviour (sometimes), and in the tests we list which things not to test, because otherwise the automated layer tests would fail for these layers.
But I think these contract violations are important, and in future versions we might want to warn the user about architectures that won't work as expected because of them. So I suggest we add the contract violations as attributes to the layer class. Something like this:
(The names are open for debate)
This would serve as documentation as well as for automatic checking. We could use that information in the tests and make sure contract violations are specified like that.
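As a rough illustration of how declared violations could drive the tests, here is a self-contained sketch. Everything in it is an assumption made for the example: the attribute names (`inputs_without_deltas`, `outputs_ignoring_deltas`, `parameters_without_gradients`), the stand-in `Layer` base class, the helper `names_to_check`, and the input and parameter names passed to it are all hypothetical and not taken from brainstorm.

```python
# Hypothetical sketch: attribute, class, and helper names are illustrative only.
class Layer:
    # Inputs for which the backward pass computes no deltas.
    inputs_without_deltas = ()
    # Outputs whose incoming deltas are ignored by the backward pass.
    outputs_ignoring_deltas = ()
    # Parameters for which no gradient is computed.
    parameters_without_gradients = ()


class SoftmaxCE(Layer):
    inputs_without_deltas = ("targets",)
    outputs_ignoring_deltas = ("probabilities",)


class Mask(Layer):
    inputs_without_deltas = ("mask",)


class ClockworkRnn(Layer):
    parameters_without_gradients = ("timing",)


def names_to_check(layer_cls, all_names, violation_attr):
    """Return the names a generic layer test should still check."""
    skipped = set(getattr(layer_cls, violation_attr))
    unknown = skipped - set(all_names)
    if unknown:
        # A declared violation that matches nothing is probably a typo.
        raise ValueError(
            f"{layer_cls.__name__} declares unknown names: {sorted(unknown)}"
        )
    return [name for name in all_names if name not in skipped]


# Assumed input names, for demonstration only.
print(names_to_check(SoftmaxCE, ["default", "targets"], "inputs_without_deltas"))
# ['default']
```

Keeping the declarations on the class means the same data could later feed a user-facing warning, and the `ValueError` branch is one way to make sure a violation is specified against names that actually exist.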
What do you guys think about that?