dmlc / MXNet.jl

num_filters in UpSampling #108


kasiabozek commented 8 years ago

According to the description, bilinear upsampling is not supposed to depend on the number of filters. However, leaving num_filter at its default results in a segfault, and setting it to a value that differs from the input's number of filters produces a cryptic error:

INFO: Start training...
[14:05:57] /home/k/mxnet/dmlc-core/include/dmlc/logging.h:235: [14:05:57] /home/k/mxnet/mshadow/mshadow/./tensor_blob.h:647: Check failed: (this->shape_.Size()) == (shape.Size()) TBlob.get_with_shape: new and old shape do not match total elements

An fatal error occurred in asynchronous engine operation. If you do not know what caused this error, you can try set environment variable MXNET_ENGINE_TYPE to NaiveEngine and run with debugger (i.e. gdb). This will force all operations to be synchronous and backtrace will give you the series of calls that lead to this error. Remember to set MXNET_ENGINE_TYPE back to empty after debugging.
terminate called after throwing an instance of 'dmlc::Error'
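
For reference, forcing the synchronous engine from Julia looks roughly like this (a sketch; the variable has to be set before libmxnet is loaded):

    ENV["MXNET_ENGINE_TYPE"] = "NaiveEngine"  # must happen before `using MXNet`
    using MXNet
    # ... reproduce the error under a debugger, then remove the setting:
    delete!(ENV, "MXNET_ENGINE_TYPE")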
pluskid commented 8 years ago

Hi @kasiabozek, can you provide more details (e.g., a short snippet that we can run to reproduce this error)?

kasiabozek commented 8 years ago

Here is the code that produces the error:

  # `data`, `nlayers`, `nfilters`, `create_conv`, `pool_kernel`,
  # `pool_stride`, `pool_type`, and `WORKSPACE` are defined elsewhere.
  net = data
  interim = mx.SymbolicNode[]

  # Contracting path: two convolutions then pooling, doubling nfilters per level.
  for i in 1:nlayers
    conv1 = create_conv(net, nfilters)
    conv2 = create_conv(conv1, nfilters)
    pool = mx.Pooling(data=conv2, kernel=pool_kernel, stride=pool_stride,
                      pool_type=pool_type)

    net = pool
    nfilters *= 2
    push!(interim, conv2)
  end

  # Bottleneck convolutions.
  net = create_conv(net, nfilters)
  net = create_conv(net, nfilters)

  # Expanding path: halve nfilters, upsample, concatenate with the matching
  # contracting-path output, and convolve.
  for i in 1:nlayers
    nfilters = div(nfilters, 2)
    upsampling = mx.UpSampling(net, scale=2, num_filter=nfilters,
                               sample_type="bilinear",
                               workspace=WORKSPACE)
    conv1 = create_conv(mx.Concat(interim[end-i+1], upsampling), nfilters)
    conv2 = create_conv(conv1, nfilters)
    net = conv2
  end

As you can see, the number of filters that I feed to the upsampling layer is not equal to the number of filters in this layer's input. If I do the division

    nfilters = div(nfilters, 2)

after the upsampling layer in the loop, then the network trains with no error.
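
Concretely, the loop that trains without error looks like this (same assumed helpers and variables as in the snippet above):

  for i in 1:nlayers
    # num_filter must match the channel count of `net` coming in,
    # so upsample with the current nfilters first ...
    upsampling = mx.UpSampling(net, scale=2, num_filter=nfilters,
                               sample_type="bilinear",
                               workspace=WORKSPACE)
    # ... and only halve nfilters afterwards, for the following convolutions.
    nfilters = div(nfilters, 2)
    conv1 = create_conv(mx.Concat(interim[end-i+1], upsampling), nfilters)
    conv2 = create_conv(conv1, nfilters)
    net = conv2
  end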

Is that the correct behavior? If it is, I suggest adding a note to the layer description about the required number of filters.

pluskid commented 8 years ago

@antinucleon Looking at the code for upsampling, I think the doc here should be updated to say that num_filter is only used by bilinear upsampling, not by nearest-neighbor upsampling.
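
For example (a sketch, assuming a symbol `x` that has 16 channels at runtime):

    x = mx.Variable(:x)
    # Nearest-neighbor upsampling ignores num_filter entirely ...
    up_nn = mx.UpSampling(x, scale=2, sample_type="nearest")
    # ... while bilinear upsampling requires num_filter == input channels.
    up_bl = mx.UpSampling(x, scale=2, num_filter=16,
                          sample_type="bilinear", workspace=512)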

Also, in the code for bilinear upsampling, this line looks a bit strange. Why is the number of groups set to the number of filters? Is that intended? Or is that the cause of the issue that @kasiabozek showed above? (The same code is in the CUDA version.)
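
For reference, the bilinear path appears to build a grouped Deconvolution with a fixed bilinear kernel and one group per channel, which would force num_filter to equal the input channel count. A rough sketch of that equivalence for scale=2 on a 16-channel input `x` (kernel, stride, and pad derived from the scale; the real op also pins the weight to bilinear coefficients):

    # Depthwise deconvolution: num_group == num_filter == input channels.
    # For scale=2: kernel = 2*scale - scale % 2 = 4, stride = scale = 2,
    # pad = ceil((scale - 1) / 2) = 1, so output is exactly 2x the input.
    up = mx.Deconvolution(x, kernel=(4,4), stride=(2,2), pad=(1,1),
                          num_filter=16, num_group=16, no_bias=true)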

vchuravy commented 8 years ago

@kasiabozek Can you take a look if https://github.com/oist/mxnet/tree/vc/upsampling fixes your problem?

kasiabozek commented 8 years ago

It produces the same behavior. My thinking was: why shouldn't num_filter be inferred from in_data for the upsampling operation, since it needs to match anyway? I'm guessing this is one way in which upsampling differs in function from deconvolution?
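
For what it's worth, the channel count is already known to shape inference at graph-construction time. A sketch with a hypothetical 16-channel convolution:

    data = mx.Variable(:data)
    conv = mx.Convolution(data, kernel=(3,3), pad=(1,1), num_filter=16)
    arg_shapes, out_shapes, aux_shapes = mx.infer_shape(conv, data=(32, 32, 3, 1))
    # out_shapes[1] is (32, 32, 16, 1) in MXNet.jl's (W, H, C, N) order;
    # that 16 is exactly what num_filter must be for bilinear UpSampling.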

xiuliren commented 7 years ago

@kasiabozek At the least, with deconvolution you can control the upsampling scale in each axis, and you can get a larger field of view by using a large kernel.
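
For example, a learnable deconvolution that doubles the height while leaving the width unchanged, using the shape rule out = (in - 1) * stride - 2 * pad + kernel per axis (names assumed from the snippet above):

    # Height: (h-1)*2 - 2*1 + 4 = 2h; width: (w-1)*1 - 2*1 + 3 = w.
    # A larger kernel here also widens the field of view.
    up = mx.Deconvolution(net, kernel=(4,3), stride=(2,1), pad=(1,1),
                          num_filter=nfilters)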