microsoft / MMdnn

MMdnn is a set of tools to help users inter-operate among different deep learning frameworks, e.g. model conversion and visualization. Convert models between Caffe, Keras, MXNet, TensorFlow, CNTK, PyTorch, ONNX and CoreML.
MIT License

Converted ResNet18 from MXNet is predicting labels incorrectly #152

Closed prashant-puri closed 6 years ago

prashant-puri commented 6 years ago

Ubuntu - 14.04

Python version: 2.7

Caffe : 1.0.0 (CPU)

MMDNN Path : http://data.dmlc.ml/mxnet/models/imagenet/resnet/18-layers/resnet-18-symbol.json http://data.dmlc.ml/mxnet/models/imagenet/resnet/18-layers/resnet-18-0000.params

I have successfully converted the MXNet model to Caffe. The resulting resnet-18.prototxt:

layer {
  name: "data"
  type: "Input"
  top: "data"
  input_param {
    shape {
      dim: 1
      dim: 3
      dim: 224
      dim: 224
    }
  }
}
layer {
  name: "bn_data"
  type: "BatchNorm"
  bottom: "data"
  top: "bn_data"
  batch_norm_param {
    use_global_stats: true
    eps: 1.99999994948e-05
  }
}
layer {
  name: "bn_data_scale"
  type: "Scale"
  bottom: "bn_data"
  top: "bn_data"
  scale_param {
    bias_term: true
  }
}
layer {
  name: "conv0"
  type: "Convolution"
  bottom: "bn_data"
  top: "conv0"
  convolution_param {
    num_output: 64
    bias_term: false
    kernel_size: 7
    group: 1
    stride: 2
    pad_h: 3
    pad_w: 3
  }
}
layer {
  name: "bn0"
  type: "BatchNorm"
  bottom: "conv0"
  top: "bn0"
  batch_norm_param {
    use_global_stats: true
    eps: 1.99999994948e-05
  }
}
layer {
  name: "bn0_scale"
  type: "Scale"
  bottom: "bn0"
  top: "bn0"
  scale_param {
    bias_term: true
  }
}
layer {
  name: "relu0"
  type: "ReLU"
  bottom: "bn0"
  top: "bn0"
}
layer {
  name: "pooling0"
  type: "Pooling"
  bottom: "bn0"
  top: "pooling0"
  pooling_param {
    pool: MAX
    kernel_size: 3
    stride: 2
    pad_h: 1
    pad_w: 1
  }
}
layer {
  name: "stage1_unit1_bn1"
  type: "BatchNorm"
  bottom: "pooling0"
  top: "stage1_unit1_bn1"
  batch_norm_param {
    use_global_stats: true
    eps: 1.99999994948e-05
  }
}
layer {
  name: "stage1_unit1_bn1_scale"
  type: "Scale"
  bottom: "stage1_unit1_bn1"
  top: "stage1_unit1_bn1"
  scale_param {
    bias_term: true
  }
}
layer {
  name: "stage1_unit1_relu1"
  type: "ReLU"
  bottom: "stage1_unit1_bn1"
  top: "stage1_unit1_bn1"
}
layer {
  name: "stage1_unit1_conv1"
  type: "Convolution"
  bottom: "stage1_unit1_bn1"
  top: "stage1_unit1_conv1"
  convolution_param {
    num_output: 64
    bias_term: false
    kernel_size: 3
    group: 1
    stride: 1
    pad_h: 1
    pad_w: 1
  }
}
layer {
  name: "stage1_unit1_sc"
  type: "Convolution"
  bottom: "stage1_unit1_bn1"
  top: "stage1_unit1_sc"
  convolution_param {
    num_output: 64
    bias_term: false
    kernel_size: 1
    group: 1
    stride: 1
    pad_h: 0
    pad_w: 0
  }
}
layer {
  name: "stage1_unit1_bn2"
  type: "BatchNorm"
  bottom: "stage1_unit1_conv1"
  top: "stage1_unit1_bn2"
  batch_norm_param {
    use_global_stats: true
    eps: 1.99999994948e-05
  }
}
layer {
  name: "stage1_unit1_bn2_scale"
  type: "Scale"
  bottom: "stage1_unit1_bn2"
  top: "stage1_unit1_bn2"
  scale_param {
    bias_term: true
  }
}
layer {
  name: "stage1_unit1_relu2"
  type: "ReLU"
  bottom: "stage1_unit1_bn2"
  top: "stage1_unit1_bn2"
}
layer {
  name: "stage1_unit1_conv2"
  type: "Convolution"
  bottom: "stage1_unit1_bn2"
  top: "stage1_unit1_conv2"
  convolution_param {
    num_output: 64
    bias_term: false
    kernel_size: 3
    group: 1
    stride: 1
    pad_h: 1
    pad_w: 1
  }
}
layer {
  name: "plus0"
  type: "Eltwise"
  bottom: "stage1_unit1_conv2"
  bottom: "stage1_unit1_sc"
  top: "plus0"
  eltwise_param {
    operation: SUM
  }
}
layer {
  name: "stage1_unit2_bn1"
  type: "BatchNorm"
  bottom: "plus0"
  top: "stage1_unit2_bn1"
  batch_norm_param {
    use_global_stats: true
    eps: 1.99999994948e-05
  }
}
layer {
  name: "stage1_unit2_bn1_scale"
  type: "Scale"
  bottom: "stage1_unit2_bn1"
  top: "stage1_unit2_bn1"
  scale_param {
    bias_term: true
  }
}
layer {
  name: "stage1_unit2_relu1"
  type: "ReLU"
  bottom: "stage1_unit2_bn1"
  top: "stage1_unit2_bn1"
}
layer {
  name: "stage1_unit2_conv1"
  type: "Convolution"
  bottom: "stage1_unit2_bn1"
  top: "stage1_unit2_conv1"
  convolution_param {
    num_output: 64
    bias_term: false
    kernel_size: 3
    group: 1
    stride: 1
    pad_h: 1
    pad_w: 1
  }
}
layer {
  name: "stage1_unit2_bn2"
  type: "BatchNorm"
  bottom: "stage1_unit2_conv1"
  top: "stage1_unit2_bn2"
  batch_norm_param {
    use_global_stats: true
    eps: 1.99999994948e-05
  }
}
layer {
  name: "stage1_unit2_bn2_scale"
  type: "Scale"
  bottom: "stage1_unit2_bn2"
  top: "stage1_unit2_bn2"
  scale_param {
    bias_term: true
  }
}
layer {
  name: "stage1_unit2_relu2"
  type: "ReLU"
  bottom: "stage1_unit2_bn2"
  top: "stage1_unit2_bn2"
}
layer {
  name: "stage1_unit2_conv2"
  type: "Convolution"
  bottom: "stage1_unit2_bn2"
  top: "stage1_unit2_conv2"
  convolution_param {
    num_output: 64
    bias_term: false
    kernel_size: 3
    group: 1
    stride: 1
    pad_h: 1
    pad_w: 1
  }
}
layer {
  name: "plus1"
  type: "Eltwise"
  bottom: "stage1_unit2_conv2"
  bottom: "plus0"
  top: "plus1"
  eltwise_param {
    operation: SUM
  }
}
layer {
  name: "stage2_unit1_bn1"
  type: "BatchNorm"
  bottom: "plus1"
  top: "stage2_unit1_bn1"
  batch_norm_param {
    use_global_stats: true
    eps: 1.99999994948e-05
  }
}
layer {
  name: "stage2_unit1_bn1_scale"
  type: "Scale"
  bottom: "stage2_unit1_bn1"
  top: "stage2_unit1_bn1"
  scale_param {
    bias_term: true
  }
}
layer {
  name: "stage2_unit1_relu1"
  type: "ReLU"
  bottom: "stage2_unit1_bn1"
  top: "stage2_unit1_bn1"
}
layer {
  name: "stage2_unit1_conv1"
  type: "Convolution"
  bottom: "stage2_unit1_bn1"
  top: "stage2_unit1_conv1"
  convolution_param {
    num_output: 128
    bias_term: false
    kernel_size: 3
    group: 1
    stride: 2
    pad_h: 1
    pad_w: 1
  }
}
layer {
  name: "stage2_unit1_sc"
  type: "Convolution"
  bottom: "stage2_unit1_bn1"
  top: "stage2_unit1_sc"
  convolution_param {
    num_output: 128
    bias_term: false
    kernel_size: 1
    group: 1
    stride: 2
    pad_h: 0
    pad_w: 0
  }
}
layer {
  name: "stage2_unit1_bn2"
  type: "BatchNorm"
  bottom: "stage2_unit1_conv1"
  top: "stage2_unit1_bn2"
  batch_norm_param {
    use_global_stats: true
    eps: 1.99999994948e-05
  }
}
layer {
  name: "stage2_unit1_bn2_scale"
  type: "Scale"
  bottom: "stage2_unit1_bn2"
  top: "stage2_unit1_bn2"
  scale_param {
    bias_term: true
  }
}
layer {
  name: "stage2_unit1_relu2"
  type: "ReLU"
  bottom: "stage2_unit1_bn2"
  top: "stage2_unit1_bn2"
}
layer {
  name: "stage2_unit1_conv2"
  type: "Convolution"
  bottom: "stage2_unit1_bn2"
  top: "stage2_unit1_conv2"
  convolution_param {
    num_output: 128
    bias_term: false
    kernel_size: 3
    group: 1
    stride: 1
    pad_h: 1
    pad_w: 1
  }
}
layer {
  name: "plus2"
  type: "Eltwise"
  bottom: "stage2_unit1_conv2"
  bottom: "stage2_unit1_sc"
  top: "plus2"
  eltwise_param {
    operation: SUM
  }
}
layer {
  name: "stage2_unit2_bn1"
  type: "BatchNorm"
  bottom: "plus2"
  top: "stage2_unit2_bn1"
  batch_norm_param {
    use_global_stats: true
    eps: 1.99999994948e-05
  }
}
layer {
  name: "stage2_unit2_bn1_scale"
  type: "Scale"
  bottom: "stage2_unit2_bn1"
  top: "stage2_unit2_bn1"
  scale_param {
    bias_term: true
  }
}
layer {
  name: "stage2_unit2_relu1"
  type: "ReLU"
  bottom: "stage2_unit2_bn1"
  top: "stage2_unit2_bn1"
}
layer {
  name: "stage2_unit2_conv1"
  type: "Convolution"
  bottom: "stage2_unit2_bn1"
  top: "stage2_unit2_conv1"
  convolution_param {
    num_output: 128
    bias_term: false
    kernel_size: 3
    group: 1
    stride: 1
    pad_h: 1
    pad_w: 1
  }
}
layer {
  name: "stage2_unit2_bn2"
  type: "BatchNorm"
  bottom: "stage2_unit2_conv1"
  top: "stage2_unit2_bn2"
  batch_norm_param {
    use_global_stats: true
    eps: 1.99999994948e-05
  }
}
layer {
  name: "stage2_unit2_bn2_scale"
  type: "Scale"
  bottom: "stage2_unit2_bn2"
  top: "stage2_unit2_bn2"
  scale_param {
    bias_term: true
  }
}
layer {
  name: "stage2_unit2_relu2"
  type: "ReLU"
  bottom: "stage2_unit2_bn2"
  top: "stage2_unit2_bn2"
}
layer {
  name: "stage2_unit2_conv2"
  type: "Convolution"
  bottom: "stage2_unit2_bn2"
  top: "stage2_unit2_conv2"
  convolution_param {
    num_output: 128
    bias_term: false
    kernel_size: 3
    group: 1
    stride: 1
    pad_h: 1
    pad_w: 1
  }
}
layer {
  name: "plus3"
  type: "Eltwise"
  bottom: "stage2_unit2_conv2"
  bottom: "plus2"
  top: "plus3"
  eltwise_param {
    operation: SUM
  }
}
layer {
  name: "stage3_unit1_bn1"
  type: "BatchNorm"
  bottom: "plus3"
  top: "stage3_unit1_bn1"
  batch_norm_param {
    use_global_stats: true
    eps: 1.99999994948e-05
  }
}
layer {
  name: "stage3_unit1_bn1_scale"
  type: "Scale"
  bottom: "stage3_unit1_bn1"
  top: "stage3_unit1_bn1"
  scale_param {
    bias_term: true
  }
}
layer {
  name: "stage3_unit1_relu1"
  type: "ReLU"
  bottom: "stage3_unit1_bn1"
  top: "stage3_unit1_bn1"
}
layer {
  name: "stage3_unit1_conv1"
  type: "Convolution"
  bottom: "stage3_unit1_bn1"
  top: "stage3_unit1_conv1"
  convolution_param {
    num_output: 256
    bias_term: false
    kernel_size: 3
    group: 1
    stride: 2
    pad_h: 1
    pad_w: 1
  }
}
layer {
  name: "stage3_unit1_sc"
  type: "Convolution"
  bottom: "stage3_unit1_bn1"
  top: "stage3_unit1_sc"
  convolution_param {
    num_output: 256
    bias_term: false
    kernel_size: 1
    group: 1
    stride: 2
    pad_h: 0
    pad_w: 0
  }
}
layer {
  name: "stage3_unit1_bn2"
  type: "BatchNorm"
  bottom: "stage3_unit1_conv1"
  top: "stage3_unit1_bn2"
  batch_norm_param {
    use_global_stats: true
    eps: 1.99999994948e-05
  }
}
layer {
  name: "stage3_unit1_bn2_scale"
  type: "Scale"
  bottom: "stage3_unit1_bn2"
  top: "stage3_unit1_bn2"
  scale_param {
    bias_term: true
  }
}
layer {
  name: "stage3_unit1_relu2"
  type: "ReLU"
  bottom: "stage3_unit1_bn2"
  top: "stage3_unit1_bn2"
}
layer {
  name: "stage3_unit1_conv2"
  type: "Convolution"
  bottom: "stage3_unit1_bn2"
  top: "stage3_unit1_conv2"
  convolution_param {
    num_output: 256
    bias_term: false
    kernel_size: 3
    group: 1
    stride: 1
    pad_h: 1
    pad_w: 1
  }
}
layer {
  name: "plus4"
  type: "Eltwise"
  bottom: "stage3_unit1_conv2"
  bottom: "stage3_unit1_sc"
  top: "plus4"
  eltwise_param {
    operation: SUM
  }
}
layer {
  name: "stage3_unit2_bn1"
  type: "BatchNorm"
  bottom: "plus4"
  top: "stage3_unit2_bn1"
  batch_norm_param {
    use_global_stats: true
    eps: 1.99999994948e-05
  }
}
layer {
  name: "stage3_unit2_bn1_scale"
  type: "Scale"
  bottom: "stage3_unit2_bn1"
  top: "stage3_unit2_bn1"
  scale_param {
    bias_term: true
  }
}
layer {
  name: "stage3_unit2_relu1"
  type: "ReLU"
  bottom: "stage3_unit2_bn1"
  top: "stage3_unit2_bn1"
}
layer {
  name: "stage3_unit2_conv1"
  type: "Convolution"
  bottom: "stage3_unit2_bn1"
  top: "stage3_unit2_conv1"
  convolution_param {
    num_output: 256
    bias_term: false
    kernel_size: 3
    group: 1
    stride: 1
    pad_h: 1
    pad_w: 1
  }
}
layer {
  name: "stage3_unit2_bn2"
  type: "BatchNorm"
  bottom: "stage3_unit2_conv1"
  top: "stage3_unit2_bn2"
  batch_norm_param {
    use_global_stats: true
    eps: 1.99999994948e-05
  }
}
layer {
  name: "stage3_unit2_bn2_scale"
  type: "Scale"
  bottom: "stage3_unit2_bn2"
  top: "stage3_unit2_bn2"
  scale_param {
    bias_term: true
  }
}
layer {
  name: "stage3_unit2_relu2"
  type: "ReLU"
  bottom: "stage3_unit2_bn2"
  top: "stage3_unit2_bn2"
}
layer {
  name: "stage3_unit2_conv2"
  type: "Convolution"
  bottom: "stage3_unit2_bn2"
  top: "stage3_unit2_conv2"
  convolution_param {
    num_output: 256
    bias_term: false
    kernel_size: 3
    group: 1
    stride: 1
    pad_h: 1
    pad_w: 1
  }
}
layer {
  name: "plus5"
  type: "Eltwise"
  bottom: "stage3_unit2_conv2"
  bottom: "plus4"
  top: "plus5"
  eltwise_param {
    operation: SUM
  }
}
layer {
  name: "stage4_unit1_bn1"
  type: "BatchNorm"
  bottom: "plus5"
  top: "stage4_unit1_bn1"
  batch_norm_param {
    use_global_stats: true
    eps: 1.99999994948e-05
  }
}
layer {
  name: "stage4_unit1_bn1_scale"
  type: "Scale"
  bottom: "stage4_unit1_bn1"
  top: "stage4_unit1_bn1"
  scale_param {
    bias_term: true
  }
}
layer {
  name: "stage4_unit1_relu1"
  type: "ReLU"
  bottom: "stage4_unit1_bn1"
  top: "stage4_unit1_bn1"
}
layer {
  name: "stage4_unit1_conv1"
  type: "Convolution"
  bottom: "stage4_unit1_bn1"
  top: "stage4_unit1_conv1"
  convolution_param {
    num_output: 512
    bias_term: false
    kernel_size: 3
    group: 1
    stride: 2
    pad_h: 1
    pad_w: 1
  }
}
layer {
  name: "stage4_unit1_sc"
  type: "Convolution"
  bottom: "stage4_unit1_bn1"
  top: "stage4_unit1_sc"
  convolution_param {
    num_output: 512
    bias_term: false
    kernel_size: 1
    group: 1
    stride: 2
    pad_h: 0
    pad_w: 0
  }
}
layer {
  name: "stage4_unit1_bn2"
  type: "BatchNorm"
  bottom: "stage4_unit1_conv1"
  top: "stage4_unit1_bn2"
  batch_norm_param {
    use_global_stats: true
    eps: 1.99999994948e-05
  }
}
layer {
  name: "stage4_unit1_bn2_scale"
  type: "Scale"
  bottom: "stage4_unit1_bn2"
  top: "stage4_unit1_bn2"
  scale_param {
    bias_term: true
  }
}
layer {
  name: "stage4_unit1_relu2"
  type: "ReLU"
  bottom: "stage4_unit1_bn2"
  top: "stage4_unit1_bn2"
}
layer {
  name: "stage4_unit1_conv2"
  type: "Convolution"
  bottom: "stage4_unit1_bn2"
  top: "stage4_unit1_conv2"
  convolution_param {
    num_output: 512
    bias_term: false
    kernel_size: 3
    group: 1
    stride: 1
    pad_h: 1
    pad_w: 1
  }
}
layer {
  name: "plus6"
  type: "Eltwise"
  bottom: "stage4_unit1_conv2"
  bottom: "stage4_unit1_sc"
  top: "plus6"
  eltwise_param {
    operation: SUM
  }
}
layer {
  name: "stage4_unit2_bn1"
  type: "BatchNorm"
  bottom: "plus6"
  top: "stage4_unit2_bn1"
  batch_norm_param {
    use_global_stats: true
    eps: 1.99999994948e-05
  }
}
layer {
  name: "stage4_unit2_bn1_scale"
  type: "Scale"
  bottom: "stage4_unit2_bn1"
  top: "stage4_unit2_bn1"
  scale_param {
    bias_term: true
  }
}
layer {
  name: "stage4_unit2_relu1"
  type: "ReLU"
  bottom: "stage4_unit2_bn1"
  top: "stage4_unit2_bn1"
}
layer {
  name: "stage4_unit2_conv1"
  type: "Convolution"
  bottom: "stage4_unit2_bn1"
  top: "stage4_unit2_conv1"
  convolution_param {
    num_output: 512
    bias_term: false
    kernel_size: 3
    group: 1
    stride: 1
    pad_h: 1
    pad_w: 1
  }
}
layer {
  name: "stage4_unit2_bn2"
  type: "BatchNorm"
  bottom: "stage4_unit2_conv1"
  top: "stage4_unit2_bn2"
  batch_norm_param {
    use_global_stats: true
    eps: 1.99999994948e-05
  }
}
layer {
  name: "stage4_unit2_bn2_scale"
  type: "Scale"
  bottom: "stage4_unit2_bn2"
  top: "stage4_unit2_bn2"
  scale_param {
    bias_term: true
  }
}
layer {
  name: "stage4_unit2_relu2"
  type: "ReLU"
  bottom: "stage4_unit2_bn2"
  top: "stage4_unit2_bn2"
}
layer {
  name: "stage4_unit2_conv2"
  type: "Convolution"
  bottom: "stage4_unit2_bn2"
  top: "stage4_unit2_conv2"
  convolution_param {
    num_output: 512
    bias_term: false
    kernel_size: 3
    group: 1
    stride: 1
    pad_h: 1
    pad_w: 1
  }
}
layer {
  name: "plus7"
  type: "Eltwise"
  bottom: "stage4_unit2_conv2"
  bottom: "plus6"
  top: "plus7"
  eltwise_param {
    operation: SUM
  }
}
layer {
  name: "bn1"
  type: "BatchNorm"
  bottom: "plus7"
  top: "bn1"
  batch_norm_param {
    use_global_stats: true
    eps: 1.99999994948e-05
  }
}
layer {
  name: "bn1_scale"
  type: "Scale"
  bottom: "bn1"
  top: "bn1"
  scale_param {
    bias_term: true
  }
}
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "bn1"
  top: "bn1"
}
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "bn1"
  top: "pool1"
  pooling_param {
    pool: AVE
    stride: 1
    global_pooling: true
  }
}
layer {
  name: "fc1"
  type: "InnerProduct"
  bottom: "pool1"
  top: "fc1"
  inner_product_param {
    num_output: 1000
    bias_term: true
  }
}
layer {
  name: "softmax"
  type: "Softmax"
  bottom: "fc1"
  top: "softmax"
}
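
Each MXNet BatchNorm op appears in the prototxt above as a Caffe BatchNorm layer (normalization with stored statistics, since use_global_stats is true) followed by a Scale layer with bias_term: true (the learned gamma and beta). A minimal NumPy sketch of the arithmetic such a pair performs at inference time may help when checking converted weights. The helper name caffe_bn_scale and all sample values are made up for illustration; they are not part of Caffe or MMdnn.

```python
import numpy as np

def caffe_bn_scale(x, mean_blob, var_blob, factor, gamma, beta, eps=2e-5):
    """Inference-time BatchNorm + Scale on an NCHW array (illustrative helper)."""
    # Caffe's BatchNorm keeps accumulated mean/variance plus a scalar factor;
    # the effective statistics are the stored blobs divided by that factor.
    inv = 0.0 if factor == 0 else 1.0 / factor
    per_channel = (1, -1, 1, 1)  # broadcast per-channel vectors over N, H, W
    mean = (mean_blob * inv).reshape(per_channel)
    var = (var_blob * inv).reshape(per_channel)
    x_hat = (x - mean) / np.sqrt(var + eps)  # BatchNorm layer
    # Scale layer: per-channel multiply (gamma) and add (beta)
    return gamma.reshape(per_channel) * x_hat + beta.reshape(per_channel)

# Made-up values for a single pixel with two channels
x = np.array([3.0, 5.0]).reshape(1, 2, 1, 1)
y = caffe_bn_scale(
    x,
    mean_blob=np.array([1.0, 1.0]),
    var_blob=np.array([4.0, 16.0]),
    factor=1.0,
    gamma=np.array([2.0, 3.0]),
    beta=np.array([0.5, -1.0]),
    eps=0.0,  # zero only to keep the toy numbers exact
)
print(y.ravel())
```

If converted outputs diverge, comparing these per-channel arrays for one layer against the corresponding gamma, beta, moving_mean and moving_var on the MXNet side is one way to narrow down where the mismatch starts.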

I then tried to classify an image using the script below:

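import numpy as np
import caffe

# base_dir: placeholder for the Caffe root directory (used below for synset_words.txt)
base_dir = 'path_to_caffe/'
prototxt_filename='path_to_resnet18/resnet18.prototxt'
caffemodel_filename='path_to_resnet18/resnet18.caffemodel'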

net = caffe.Net(prototxt_filename,
                caffemodel_filename,
                caffe.TEST)

# load input and configure preprocessing
transformer = caffe.io.Transformer({'data': net.blobs['data'].data.shape})
transformer.set_transpose('data', (2,0,1))
transformer.set_mean('data', np.array([104, 117, 123]))

transformer.set_raw_scale('data', 255.0)
transformer.set_channel_swap('data', (2,1,0))
net.blobs['data'].reshape(1,3,224,224)

image_name = 'http://farm4.static.flickr.com/3170/2533026039_a4d72913ec.jpg'

im = caffe.io.load_image(image_name)
net.blobs['data'].data[...] = transformer.preprocess('data', im)

out = net.forward()

output_prob = out['softmax'][0]

#print predicted labels
labels = np.loadtxt(base_dir + "data/ilsvrc12/synset_words.txt", str, delimiter='\t')
top_k = net.blobs['softmax'].data[0].flatten().argsort()[::-1][:5]

for prob, class_ in zip(output_prob[top_k], labels[top_k]):
    print("prob={} class={}".format(prob, class_))

Output:

prob=1.0 class=n03250847 drumstick
prob=0.0 class=n15075141 toilet tissue, toilet paper, bathroom tissue
prob=0.0 class=n02317335 starfish, sea star
prob=0.0 class=n02389026 sorrel
prob=0.0 class=n02364673 guinea pig, Cavia cobaya

Here I am getting the drumstick label with probability 1.0.
Please help.
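
A probability of exactly 1.0 with every other class at 0.0 means the winning logit in fc1 is far larger than the rest, so the softmax saturates. The sketch below only illustrates that effect in float32 with made-up logits; it does not identify the cause in this model.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over a 1-D array."""
    shifted = logits - np.max(logits)  # largest entry becomes 0
    exp = np.exp(shifted)
    return exp / exp.sum()

# Made-up logits: a moderate spread versus one dominant value
moderate = softmax(np.array([2.0, 1.0, 0.0], dtype=np.float32))
extreme = softmax(np.array([200.0, 1.0, 0.0], dtype=np.float32))

print(moderate)  # probabilities spread over all three classes
print(extreme)   # the dominant logit takes all of the probability mass
```

Printing the raw fc1 blob (net.blobs['fc1'].data) next to the softmax output is a quick way to see whether the logits are unusually large.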

kitstar commented 6 years ago

Hi @prashant-puri, the MXNet ResNet model doesn't apply any preprocessing. You can remove the

transformer.set_mean('data', np.array([104, 117, 123]))
transformer.set_raw_scale('data', 255.0)
transformer.set_channel_swap('data', (2,1,0))     # Not sure about this one

and see if it works.
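
To make the effect of those three calls concrete, here is a NumPy-only sketch that mimics the order of operations in caffe.io.Transformer.preprocess (transpose, channel swap, raw scale, mean subtraction) on an image as returned by caffe.io.load_image, i.e. float RGB in [0, 1] with shape HxWxC. The function preprocess is a stand-in written for this example, not the Caffe API, and resizing is left out.

```python
import numpy as np

def preprocess(im, mean=None, raw_scale=None, channel_swap=None):
    """Stand-in for Transformer.preprocess on an HxWxC float image (no resize)."""
    out = im.astype(np.float32).transpose(2, 0, 1)  # HxWxC -> CxHxW
    if channel_swap is not None:
        out = out[list(channel_swap), :, :]          # e.g. RGB -> BGR
    if raw_scale is not None:
        out = out * raw_scale                        # [0, 1] -> [0, 255]
    if mean is not None:
        # per-channel mean, broadcast over height and width
        out = out - np.asarray(mean, dtype=np.float32).reshape(-1, 1, 1)
    return out

# Toy 4x4 image: R = 1.0, G = B = 0.5
im = np.full((4, 4, 3), 0.5, dtype=np.float32)
im[..., 0] = 1.0

# Settings from the original script
full = preprocess(im, mean=[104, 117, 123], raw_scale=255.0, channel_swap=(2, 1, 0))
# With the three calls removed, only the transpose remains
bare = preprocess(im)

print(full[:, 0, 0])  # BGR order, scaled to [0, 255], mean subtracted
print(bare[:, 0, 0])  # RGB order, still in [0, 1]
```

With the three calls removed, the network sees RGB values in [0, 1]. Whether that matches the range the MXNet model was trained on is worth checking against the MXNet side.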

prashant-puri commented 6 years ago

Hi @kitstar, I really appreciate your answer. I commented out all three lines as suggested. It runs, but the results are still bad. Please check the result below, obtained after removing the code you suggested. The input image is the same as before: image_name = 'http://farm4.static.flickr.com/3170/2533026039_a4d72913ec.jpg'


prob=0.999646782875 class=n04286575 spotlight, spot
prob=0.000113813846838 class=n04515003 upright, upright piano
prob=7.75204971433e-05 class=n03759954 microphone, mike
prob=5.58655337954e-05 class=n03666591 lighter, light, igniter, ignitor
prob=4.96878346894e-05 class=n03483316 hand blower, blow dryer, blow drier, hair dryer, hair drier

kitstar commented 6 years ago

Hi @prashant-puri . Found the problem. Will fix soon and ping you asap. Thanks!

prashant-puri commented 6 years ago

@kitstar Thanks. Waiting for your reply :)

kitstar commented 6 years ago

Hi @prashant-puri . Fixed. Please try the newest code again.