IDSIA / brainstorm

Fast, flexible and fun neural networks.

CUDA convolutional/pooling layers #25

Status: Closed (flukeskywalker closed this 9 years ago)

flukeskywalker commented 9 years ago

Adding convolution/pooling layers will significantly extend the library's capabilities. The base has been designed with this in mind, so having them is a pre-alpha goal.

flukeskywalker commented 9 years ago

@untom, you mentioned that you were interested in working on this. Are there any major hurdles that might prevent development along these lines?

untom commented 9 years ago

I was mainly hoping you'd implement the convolutional backward pass on the CPU first :P On the GPU side, the forward and backward passes are already implemented in the feature/gpu_conv branch... I just have nothing to test the GPU backward pass against ;)

flukeskywalker commented 9 years ago

Ah okay. I'll get on finishing that soon. BTW, if we use cuDNN for conv/pooling etc., shouldn't the testing burden be on cuDNN? Essentially, shouldn't we 'assume' it works? We should still do some integration tests though.

flukeskywalker commented 9 years ago

There is now a functional conv2d forward-backward implementation. The only unsupported feature currently is strides > 1. I tried doing a pure NumPy implementation, mostly as an exercise, but it leads to a bunch of magic. We should probably switch to Cython implementations of conv/pool etc. on the CPU. Nevertheless, maybe this is good enough to continue work on the cuDNN wrappers?
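
For anyone curious, here's roughly what the naive pure-NumPy version boils down to (a minimal sketch only: stride 1, no padding, NCHW layout, cross-correlation convention; this is not the actual brainstorm code). Vectorizing these loops away, e.g. via im2col-style reshuffling, is where the "magic" creeps in, which is the argument for a Cython port.

```python
import numpy as np

def conv2d_forward(inputs, weights, bias):
    """Naive conv2d forward pass (stride 1, no padding).

    inputs:  (N, C, H, W)
    weights: (F, C, KH, KW)
    bias:    (F,)
    returns: (N, F, H - KH + 1, W - KW + 1)
    """
    N, C, H, W = inputs.shape
    F, _, KH, KW = weights.shape
    out_h, out_w = H - KH + 1, W - KW + 1
    outputs = np.zeros((N, F, out_h, out_w), dtype=inputs.dtype)
    for n in range(N):                      # over samples
        for f in range(F):                  # over filters
            for i in range(out_h):          # over output rows
                for j in range(out_w):      # over output columns
                    patch = inputs[n, :, i:i + KH, j:j + KW]
                    outputs[n, f, i, j] = np.sum(patch * weights[f]) + bias[f]
    return outputs
```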

untom commented 9 years ago

Awesome! I'll get on it, then :)

Switching to Cython would be an idea. Cudarray has a Cython implementation of conv/pool, though it doesn't look too sophisticated (https://github.com/andersbll/cudarray/blob/master/cudarray/numpy_backend/nnet/conv_bc01.pyx)

untom commented 9 years ago

A few naming issues (a hypothetical sketch follows this list):

  1. The Convolution layer has a kernel_size. For Pooling, window_size seems more appropriate, but maybe using the same name for both layers makes sense?
  2. The padding option is currently called pad, but I find padding a bit more natural (though I don't really care either way, TBH). Any preferences on how we name it?
  3. In brainstorm.layers.__init__, should we set default strides (e.g. (1, 1)) and padding (e.g. 1) for Conv/Pool operations?
  4. What should the default activation function for the Conv layer be? For FC it's currently "linear". Wouldn't ReLU be a more adequate default?
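
To make the options concrete, they boil down to something like the following sketch (names and defaults here are purely illustrative, not brainstorm's final signatures):

```python
# Hypothetical layer factories illustrating the questions above.

def Convolution2D(num_filters, kernel_size, stride=(1, 1), padding=0,
                  activation='relu'):          # 4.: ReLU instead of linear?
    # 2./3.: 'padding' rather than 'pad', with defaults set in layers.__init__
    return dict(num_filters=num_filters, kernel_size=kernel_size,
                stride=stride, padding=padding, activation=activation)

def Pooling2D(kernel_size, stride=(1, 1), padding=0):
    # 1.: reuse 'kernel_size' for pooling too, or call it 'window_size'?
    return dict(kernel_size=kernel_size, stride=stride, padding=padding)
```
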
untom commented 9 years ago

To add: should it be ConvolutionLayer2D or Convolution2DLayer? The latter makes more sense to me, but looks uglier... (Same for PoolingLayer2D / Pooling2DLayer)

untom commented 9 years ago

Disregard that last one, I forgot we dropped the "Layer" suffix anyhow :D

Qwlouse commented 9 years ago

  1. I'd stick to the same name for both, but either one is OK with me.
  2. Slight preference towards padding.
  3. Seems convenient to me, especially for strides. Is 1 for padding really a good default?
  4. I agree it should rather be ReLU for both.

untom commented 9 years ago

Well, for convolutions, the amount of padding typically depends on the size of the kernel (as far as I know).
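
For example, keeping the spatial size unchanged ("same" convolution) with stride 1 needs a padding of (kernel_size - 1) / 2 per side for odd kernel sizes, so a fixed default of 1 only fits 3x3 kernels. A quick check:

```python
def same_padding(kernel_size):
    # Padding per side that keeps H/W unchanged for a stride-1 convolution
    # with an odd kernel size: out = (H + 2*pad - k) + 1 == H.
    return (kernel_size - 1) // 2

assert same_padding(3) == 1
assert same_padding(5) == 2
assert same_padding(7) == 3
```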

untom commented 9 years ago

The functionality has been merged and seems to work nicely, but there's still some stuff left to do:

  1. There is probably a lot of overhead in all the calls we currently make, from mallocing/setting up/freeing descriptors all the time. Each of these calls goes through CFFI, which AFAIK is pretty fast, but it might be even nicer to wrap this all up into one Cython call (that would also allow us to get rid of the python-cudnn-wrapper dependency). I don't think this has high priority, though.
  2. I'd like to figure out how much time the alloc/free of the descriptors costs us and think of a remedy (e.g. per-layer storage; a minimal caching sketch follows this list).
  3. We currently use a pre-defined convolution algorithm that uses no additional GPU workspace. However, cuDNN would allow us to pick the best algorithm/workspace size for the given convolution parameters. This again needs some kind of layer-specific storage.
  4. Switch to the cuDNN v3 API (which essentially just offers more ways to pick the right algorithm given the convolution parameters).
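
One possible shape for the per-layer storage in items 1-3: create the cuDNN descriptors (and pick the algorithm/workspace) once per distinct configuration and reuse them on every forward/backward call. This is only a sketch of the caching idea; `create_fn` below is a stand-in for whatever wrapper calls we end up making, not an existing API.

```python
class ConvDescriptorCache(object):
    """Caches per-configuration setup so the (comparatively expensive)
    descriptor/algorithm setup is done once, not on every call."""

    def __init__(self, create_fn):
        self._create_fn = create_fn   # stand-in: wraps the cuDNN setup calls
        self._cache = {}

    def get(self, input_shape, kernel_shape, stride, padding):
        key = (input_shape, kernel_shape, stride, padding)
        if key not in self._cache:
            # Only hit the CFFI/cuDNN setup path once per configuration;
            # afterwards the cached descriptors are reused.
            self._cache[key] = self._create_fn(input_shape, kernel_shape,
                                               stride, padding)
        return self._cache[key]
```
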
flukeskywalker commented 9 years ago

I think it's pretty great!

1 and 2 don't need high priority, I'd say. The cost of setting up descriptors etc. will be minuscule compared to data handling and computation, especially for larger inputs.

3 and 4 should definitely be done. We should soon set up a large-scale experiment (ImageNet/PASCAL etc.), and these will be crucial for effectively utilizing the compute power.

flukeskywalker commented 9 years ago

Addendum: I think that a) Pooling2D doesn't need num_filters as an arg, and b) it should also support avg. pooling, right?

untom commented 9 years ago

It doesn't need num_filters, you're right. As for avg pooling: we can either add it there or in its own layer; I'm not sure which is better?

flukeskywalker commented 9 years ago

Probably best to add a type or pool_type parameter to the layer itself. Avoiding layer proliferation should be better in the long run.
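
For concreteness, something along these lines (hypothetical signature, just illustrating the single-layer approach):

```python
def Pooling2D(kernel_size, stride=(1, 1), padding=0, type='max'):
    """One pooling layer covering both modes: type is 'max' or 'avg',
    instead of separate MaxPooling2D / AvgPooling2D layers."""
    assert type in ('max', 'avg')
    return dict(kernel_size=kernel_size, stride=stride,
                padding=padding, type=type)
```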


untom commented 9 years ago

k

flukeskywalker commented 9 years ago

Bulk of work for this is in. Thanks @untom!