BVLC / caffe

Caffe: a fast open framework for deep learning.
http://caffe.berkeleyvision.org/

Any input produces the same output #1396

Closed mender05 closed 9 years ago

mender05 commented 9 years ago

I am trying to use Caffe to implement DeepPose, proposed in this paper: http://arxiv.org/abs/1312.4659. DeepPose has 3 stages, and each stage is almost the same as AlexNet (DeepPose replaces AlexNet's loss layer with a Euclidean loss). It is in fact a regression problem.

The train.prototxt is:

name: "CaffeNet"
layers {
  name: "image"
  type: DATA
  top: "image"
  data_param {
    source: "examples/lsp/lsp_train_images_lmdb"
    backend: LMDB
    batch_size: 30
    scale: 0.00390625  # = 1/256, maps 8-bit pixels into [0, 1)
  }
}
layers {
  name: "label"
  type: DATA
  top: "label"
  data_param {
    source: "examples/lsp/lsp_train_labels_lmdb"
    backend: LMDB
    batch_size: 30
    scale: 0.00454545  # = 1/220 (the input images are 220x220)
  }
}
layers {
  name: "conv1"
  type: CONVOLUTION
  bottom: "image"
  top: "conv1"
... (the rest of conv1 through fc7 is the same as in AlexNet) ...
layers {
  name: "fc8"
  type: INNER_PRODUCT
  bottom: "fc7"
  top: "fc8"
  blobs_lr: 1
  blobs_lr: 2
  weight_decay: 1
  weight_decay: 0
  inner_product_param {
    num_output: 28
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layers {
  name: "loss"
  type: EUCLIDEAN_LOSS
  bottom: "fc8"
  bottom: "label"
  top: "loss"
}
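
For reference, Caffe's EUCLIDEAN_LOSS layer computes

E = \frac{1}{2N} \sum_{n=1}^{N} \lVert \hat{y}_n - y_n \rVert_2^2

so the 28 outputs of fc8 are regressed directly against the 28 label values per image (presumably the 14 LSP joints times 2 coordinates, scaled by the label layer's 0.00454545).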

The solver.prototxt is:

net: "models/lsp/deeppose_train.prototxt"
base_lr: 0.001
lr_policy: "step"
gamma: 0.1
stepsize: 7500
display: 50
max_iter: 36500
momentum: 0.9
weight_decay: 0.0000005
snapshot: 2000
snapshot_prefix: "models/lsp/caffenet_train"
solver_mode: GPU
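
For reference, Caffe's "step" policy sets lr = base_lr * gamma ^ floor(iter / stepsize), so with these settings the learning rate decays from 1e-3 down to 1e-7 over the 36500 iterations. A quick sketch of the schedule:

# Caffe "step" learning-rate policy: lr = base_lr * gamma ^ floor(iter / stepsize)
base_lr, gamma, stepsize = 0.001, 0.1, 7500
for it in (0, 7500, 15000, 22500, 30000):
    print(it, base_lr * gamma ** (it // stepsize))
# 0 -> 0.001, 7500 -> 0.0001, 15000 -> 1e-05, 22500 -> 1e-06, 30000 -> 1e-07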

After training completed, I use the Python interface to run prediction on the test set. The test.prototxt is:

name: "CaffeNet"
layers {
  name: "image"
  type: MEMORY_DATA
  top: "image"
    top: "useless"
  memory_data_param {
    batch_size: 30
    channels: 3
    height: 220
    width: 220
  }
}
layers {
  name: "conv1"
  type: CONVOLUTION
  bottom: "image"
... (the rest, through fc7, same as in train.prototxt) ...
layers {
  name: "fc8"
  type: INNER_PRODUCT
  bottom: "fc7"
  top: "fc8"
  blobs_lr: 1
  blobs_lr: 2
  weight_decay: 1
  weight_decay: 0
  inner_product_param {
    num_output: 28
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}

But the output is very strange. Dumping the output of the "fc8" layer, I find that all images produce the same output:

array([[ 0.48381898,  0.02326088,  0.02317634,  0.02317682,  0.48248914,
         0.01622555,  0.0161516 ,  0.01615119,  0.48646507,  0.03201264,
         0.03185751,  0.03185739,  0.52191395,  0.03508802,  0.03494693,
         0.03494673,  0.52380753,  0.01708153,  0.01701014,  0.01700996,
         0.52726734,  0.02286946,  0.02277863,  0.0227785 ,  0.46513146,
         0.02239206,  0.02227863,  0.02227836],
       [ 0.48381898,  0.02326088,  0.02317634,  0.02317682,  0.48248914,
         0.01622555,  0.0161516 ,  0.01615119,  0.48646507,  0.03201264,
         0.03185751,  0.03185739,  0.52191395,  0.03508802,  0.03494693,
         0.03494673,  0.52380753,  0.01708153,  0.01701014,  0.01700996,
         0.52726734,  0.02286946,  0.02277863,  0.0227785 ,  0.46513146,
         0.02239206,  0.02227863,  0.02227836],
       [ 0.48381898,  0.02326088,  0.02317634,  0.02317682,  0.48248914,
         0.01622555,  0.0161516 ,  0.01615119,  0.48646507,  0.03201264,
         0.03185751,  0.03185739,  0.52191395,  0.03508802,  0.03494693,
         0.03494673,  0.52380753,  0.01708153,  0.01701014,  0.01700996,
         0.52726734,  0.02286946,  0.02277863,  0.0227785 ,  0.46513146,
         0.02239206,  0.02227863,  0.02227836],

In fact, no matter what the inputs are, the outputs are always exactly the values above. What causes this problem?
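
One mismatch is visible in the prototxts themselves: the training DATA layers scale their inputs (scale: 0.00390625 for the images), but the MEMORY_DATA layer applies no such transform, so test images arrive at 256 times the scale seen during training. A minimal pycaffe sketch of a forward pass that applies the same scaling (the paths and the load_images helper are placeholders, not from the original post):

import numpy as np
import caffe

# Placeholder paths -- substitute the real deploy file and snapshot.
net = caffe.Net('models/lsp/deeppose_test.prototxt',
                'models/lsp/caffenet_train_iter_36000.caffemodel',
                caffe.TEST)

images = load_images()  # hypothetical loader: (30, 3, 220, 220) float32, pixels in [0, 255]
images *= 0.00390625    # apply the same scale the training DATA layer used

# MEMORY_DATA also expects a label array, even though it is unused here.
labels = np.zeros(images.shape[0], dtype=np.float32)
net.set_input_arrays(np.ascontiguousarray(images, dtype=np.float32), labels)
out = net.forward()
print(out['fc8'])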

YangForever commented 7 years ago

Hi, have you solved this problem? I also followed this approach of using AlexNet to compute the coordinates of the body joints, and I get the same outputs. However, when I use a simple network like LeNet, it performs better and produces reasonable results. So I guess AlexNet is too deep, but I still do not know how the paper achieved this with AlexNet.
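
One way to test whether the network has simply collapsed: for a Euclidean loss, the constant output that minimizes the training loss is the mean of the training labels, so a collapsed net will dump an fc8 vector close to that mean. A sketch of the check, assuming the labels were stored as Datum float_data in the LMDB shown above:

import numpy as np
import lmdb
from caffe.proto import caffe_pb2

# Read every label vector back out of the training-label LMDB.
env = lmdb.open('examples/lsp/lsp_train_labels_lmdb', readonly=True)
labels = []
with env.begin() as txn:
    for _, raw in txn.cursor():
        datum = caffe_pb2.Datum()
        datum.ParseFromString(raw)
        labels.append(np.array(datum.float_data, dtype=np.float32))

# Apply the same scale the label DATA layer used, then compare this
# vector against the constant fc8 output dumped at test time.
mean_label = 0.00454545 * np.mean(labels, axis=0)
print(mean_label)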

Manchery commented 6 years ago

I ran into the same problem while implementing DeepPose. Then I read every comment here and found them all really helpful. I changed my model and deploy file many times according to this advice, and the problem finally disappeared. Because of so many changes, I can't say which particular change was the key that made things work. But I really suggest, to those of you still suffering from this problem, that you try each piece of advice whether or not it seems promising. Thank you all!