Feedforward, convolutions, shape inferences

dwf commented 8 years ago

It seems like we should:

- extend the Feedforward interface to include shape tuples like convolutions
- allow for programmatically determining whether a Feedforward Brick can compute its own output_dim or whether it requires specification (related to #1049)
- unify the annoying input_dim / output_dim vs get_dim() which is extra cognitive load and doesn't simplify anything
- unify ConvolutionalSequence with a new Sequence object that can do shape inferences through mixed stacks of spatial bricks and non-spatial bricks.

@ddtm raised the point that Torch does the last one, and I don't see any reason we shouldn't be able to achieve it either, if we become a bit more disciplined about our protocol surrounding shapes.
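To make the last point concrete, here is a rough, self-contained sketch of what shape inference through a mixed stack could look like. It is not Blocks code: the classes `Conv`, `Flatten`, `Linear`, and `Sequence`, and the `output_shape` method are all hypothetical names invented for illustration, and the convolution is assumed to be "valid" with stride 1.

```python
from math import prod


class Conv:
    """Toy spatial layer (hypothetical, not the Blocks Convolutional brick)."""

    def __init__(self, num_filters, filter_size):
        self.num_filters = num_filters
        self.filter_size = filter_size

    def output_shape(self, input_shape):
        # input_shape is (channels, height, width)
        _, height, width = input_shape
        k_h, k_w = self.filter_size
        # Assumes a 'valid' convolution with stride 1.
        return (self.num_filters, height - k_h + 1, width - k_w + 1)


class Flatten:
    """Toy layer that collapses a spatial shape into a 1-tuple."""

    def output_shape(self, input_shape):
        return (prod(input_shape),)


class Linear:
    """Toy non-spatial layer whose input_dim is filled in by inference."""

    def __init__(self, output_dim):
        self.output_dim = output_dim
        self.input_dim = None

    def output_shape(self, input_shape):
        if len(input_shape) != 1:
            raise ValueError("Linear expects a flat shape, got %r" % (input_shape,))
        (self.input_dim,) = input_shape
        return (self.output_dim,)


class Sequence:
    """Toy container that pushes a shape tuple through every layer in order."""

    def __init__(self, layers, input_shape):
        self.layers = list(layers)
        self.input_shape = tuple(input_shape)

    def infer_shapes(self):
        shapes = [self.input_shape]
        for layer in self.layers:
            shapes.append(layer.output_shape(shapes[-1]))
        return shapes


seq = Sequence([Conv(16, (5, 5)), Flatten(), Linear(10)], input_shape=(3, 32, 32))
shapes = seq.infer_shapes()
```

The only discipline this relies on is that every layer, spatial or not, speaks the same protocol: a shape tuple in, a shape tuple out. With that in place the container does not need to know which kind of layer it is holding.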

"Flat is better than nested", etc. etc.

@rizar Your thoughts would be appreciated. @bartvm yours too since you have experience with Torch.

rizar commented 8 years ago

Regarding input_dim and output_dim: not all the bricks that people work with are feedforward. Only feedforward bricks have those properties. For other bricks, get_dim makes a lot of sense.

I fully understand that being able to put Feedforward bricks in ConvolutionalSequence would make most convnet scripts a bit shorter. We should think about how we can do it, but preferably without breaking the rest of the library :)

dwf commented 8 years ago

Yes, I'm aware. Convolutional bricks are an instance where input_dim and output_dim aliases make sense, though (and in fact input_dim is already an allocation argument!).
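As a rough illustration of the aliasing idea, the sketch below treats `input_dim` and `output_dim` as thin read-only properties over a single `get_dim` lookup, and adds a flag that says whether a brick can work out its own output dimension. Everything here is hypothetical and not taken from Blocks: the class names, the `"input"` / `"output"` keys, and the `infers_output_dim` attribute are assumptions made for the example.

```python
class ToyBrick:
    """Hypothetical base class: one source of truth for dimensions."""

    # Hypothetical flag: True if output_dim follows from input_dim alone.
    infers_output_dim = False

    def get_dim(self, name):
        raise NotImplementedError

    @property
    def input_dim(self):
        # Alias, not a separate attribute.
        return self.get_dim("input")

    @property
    def output_dim(self):
        return self.get_dim("output")


class ToyLinear(ToyBrick):
    """Output size is a free choice, so it has to be specified."""

    def __init__(self, input_dim, output_dim):
        self._dims = {"input": input_dim, "output": output_dim}

    def get_dim(self, name):
        return self._dims[name]


class ToyElementwise(ToyBrick):
    """Output shape equals input shape, so it can be inferred."""

    infers_output_dim = True

    def __init__(self, input_dim):
        self._input_dim = input_dim

    def get_dim(self, name):
        if name in ("input", "output"):
            return self._input_dim
        raise ValueError("unknown dimension name: %r" % (name,))


linear = ToyLinear(input_dim=784, output_dim=100)
activation = ToyElementwise(input_dim=(16, 28, 28))
```

Because the properties only delegate, a shape tuple such as `(16, 28, 28)` passes through the same interface as a plain integer.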

I agree we shouldn't, and I don't think we have to, break anything in particular to yield an object (probably a new one) that has the desirable shape inference properties.

mila-iqia / blocks

Feedforward, convolutions, shape inferences #1053