forresti / SqueezeNet

SqueezeNet: AlexNet-level accuracy with 50x fewer parameters
BSD 2-Clause "Simplified" License

Replace global pooling with explicitly defined window #30

Closed (psyhtest closed this issue 7 years ago)

psyhtest commented 7 years ago

While trying out the SqueezeNet variants (1.0 and 1.1) on a Jetson TX1 dev board with TensorRT 1.0.0, I got the following error:

Parameter check failed in addPooling, condition: windowSize.h > 0 && windowSize.w > 0 && windowSize.h*windowSize.w < MAX_KERNEL_DIMS_PRODUCT
error parsing layer type Pooling index 64

I believe this refers to the following layer in the definitions (identical in both variants):

layer {
  name: "pool10"
  type: "Pooling"
  bottom: "conv10"
  top: "pool10"
  pooling_param {
    pool: AVE
    global_pooling: true
  }
}

I've just got the following advice from NVIDIA:

The TensorRT Caffe parser doesn't support global pooling, so it's just taking the H and W parameters from the network definition, and those default to 0. The API check is complaining that there isn't a valid pooling layer definition. If you replace the global pooling with an explicitly defined window, TensorRT should work.

Alas, I'm not a Caffe expert, so I'm struggling a bit with how to do that. Could anyone suggest how the SqueezeNet definitions should be updated so that the recognition accuracy is maintained?
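To illustrate what "an explicitly defined window" has to achieve, here is a minimal sketch in plain Python (not Caffe or TensorRT code). It assumes a square window with no padding and only full windows; the helper names `avg_pool` and `global_avg_pool` are made up for this example. It suggests that an explicit window reproduces global average pooling only when the window covers the whole feature map.

```python
def avg_pool(fmap, kernel, stride):
    """Average-pool a 2D list with a square window, no padding, full windows only."""
    h, w = len(fmap), len(fmap[0])
    out = []
    for top in range(0, h - kernel + 1, stride):
        row = []
        for left in range(0, w - kernel + 1, stride):
            window = [fmap[top + i][left + j]
                      for i in range(kernel) for j in range(kernel)]
            row.append(sum(window) / len(window))
        out.append(row)
    return out


def global_avg_pool(fmap):
    """Average over every element of the feature map (one value per channel)."""
    values = [v for row in fmap for v in row]
    return sum(values) / len(values)


# A toy 3x3 feature map standing in for one channel of conv10.
fmap = [[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]]

whole_map = avg_pool(fmap, kernel=3, stride=3)   # window == map size: one value
too_small = avg_pool(fmap, kernel=2, stride=1)   # smaller window: a 2x2 grid
```

With the window equal to the map size, `whole_map` holds a single value equal to `global_avg_pool(fmap)`. With a smaller window, `too_small` is a grid of local averages, which is not what the layers after `pool10` expect.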

psyhtest commented 7 years ago

It appears somebody hit the same issue when converting to TensorFlow: https://github.com/ethereon/caffe-tensorflow/issues/53#thread-subscription-status

I haven't tried their suggestions yet though.

psyhtest commented 7 years ago

I changed deploy.prototxt as suggested in the above TensorFlow issue like this:

diff --git a/package/caffemodel-deepscale-squeezenet-1.1/deploy.prototxt b/package/caffemodel-deepscale-squeezenet-1.1/deploy.prototxt
index 6fc204e..4d4e04d 100644
--- a/package/caffemodel-deepscale-squeezenet-1.1/deploy.prototxt
+++ b/package/caffemodel-deepscale-squeezenet-1.1/deploy.prototxt
@@ -547,7 +547,8 @@ layer {
   top: "pool10"
   pooling_param {
     pool: AVE
-    global_pooling: true
+    kernel_size: 13
+    stride: 1
   }
 }
 layer {

and TensorRT's parser accepted it. Unfortunately, the classification was completely off (e.g. 1000 mispredictions out of 1000). Any ideas?
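One possible explanation, sketched below, is that a 13x13 window with stride 1 no longer reduces the `conv10` output to a single value per class. The sketch uses the pooled-size formula Caffe applies to Pooling layers without padding (ceiling rounding). The input sizes 14 and 15 are assumptions, chosen to be consistent with the kernel sizes reported further down this thread; the function name `caffe_pool_out` is hypothetical.

```python
import math


def caffe_pool_out(size, kernel, stride):
    """Output size along one dimension of an unpadded Caffe Pooling layer."""
    return math.ceil((size - kernel) / stride) + 1


# ASSUMPTION: conv10 produces a 14x14 map in SqueezeNet 1.1 and 15x15 in 1.0.
out_v11_k13 = caffe_pool_out(14, kernel=13, stride=1)   # window from the diff above
out_v10_k13 = caffe_pool_out(15, kernel=13, stride=1)
out_v11_k14 = caffe_pool_out(14, kernel=14, stride=14)  # window covering the map
out_v10_k15 = caffe_pool_out(15, kernel=15, stride=15)

print(out_v11_k13, out_v10_k13, out_v11_k14, out_v10_k15)
```

Under these assumptions, `kernel_size: 13` leaves a 2x2 (or 3x3) grid per class instead of the 1x1 output that global pooling would give, so the softmax input has a different shape than the trained model expects.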

fengziyong commented 7 years ago

@psyhtest I also found that the output of average pooling in TensorRT is wrong. I think it's a bug in TensorRT.

psyhtest commented 7 years ago

@fengziyong, thanks! I should get in touch with someone from the TensorRT team soon, and will ask their opinion on this issue.

psyhtest commented 7 years ago

With help from @milakov from NVIDIA (thanks!), I've created two new CK-TensorRT packages that explicitly define the kernel size and stride for the global average pooling layer:

https://github.com/dividiti/ck-tensorrt/tree/master/package/caffemodel-deepscale-squeezenet-1.0-explicit-window-global-pooling

>     kernel_size: 15
>     stride: 15

https://github.com/dividiti/ck-tensorrt/tree/master/package/caffemodel-deepscale-squeezenet-1.1-explicit-window-global-pooling

>     kernel_size: 14
>     stride: 14

These changes make TensorRT 1.0.0 (GIE) happy!
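The kernel sizes 15 and 14 appear to match the spatial size of the `conv10` output in each variant. The sketch below traces that size through the layers that change it. Everything in the two layer lists is an assumption recalled from the published SqueezeNet definitions, not taken from this thread: a 227x227 input, the `conv1` kernel sizes, three 3x3 stride-2 max-pooling layers, and `pad: 1` on the 1x1 `conv10` in version 1.0 only. Fire modules are omitted because they are assumed to preserve spatial size. Please check the numbers against your own deploy.prototxt.

```python
import math


def conv_out(size, kernel, stride=1, pad=0):
    """Caffe Convolution output size along one dimension (floor rounding)."""
    return (size + 2 * pad - kernel) // stride + 1


def pool_out(size, kernel, stride):
    """Unpadded Caffe Pooling output size along one dimension (ceil rounding)."""
    return math.ceil((size - kernel) / stride) + 1


def final_map_size(input_size, layers):
    """Apply each (kind, kernel, stride, pad) layer in order and return the size."""
    size = input_size
    for kind, kernel, stride, pad in layers:
        if kind == "conv":
            size = conv_out(size, kernel, stride, pad)
        else:
            size = pool_out(size, kernel, stride)
    return size


# ASSUMED layer parameters (kind, kernel, stride, pad); verify against the prototxt.
SQUEEZENET_1_0 = [("conv", 7, 2, 0), ("pool", 3, 2, 0), ("pool", 3, 2, 0),
                  ("pool", 3, 2, 0), ("conv", 1, 1, 1)]  # conv10 with pad: 1
SQUEEZENET_1_1 = [("conv", 3, 2, 0), ("pool", 3, 2, 0), ("pool", 3, 2, 0),
                  ("pool", 3, 2, 0), ("conv", 1, 1, 0)]  # conv10 without padding

size_v10 = final_map_size(227, SQUEEZENET_1_0)
size_v11 = final_map_size(227, SQUEEZENET_1_1)
print(size_v10, size_v11)
```

If the assumptions hold, the trace gives 15 for version 1.0 and 14 for version 1.1, so a window of that size collapses each class map to one value, as global pooling would.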

Since these SqueezeNet packages are only intended as a workaround, they will be kept in the CK-TensorRT repository. Note also that by default they install files into the same locations (~/CK_TOOLS/caffemodel-deepscale-squeezenet-1.{0,1}) as the SqueezeNet packages in the CK-Caffe repository.