Closed: lizhenstat closed this issue 4 years ago
Hi there,
I think you were testing with the wrong model. Here is our profile using the package you provided:
----------------------------------------------------------------
Layer (type) Output Shape Param #
================================================================
Conv2d-1 [-1, 16, 32, 32] 432
BatchNorm2d-2 [-1, 16, 32, 32] 32
ReLU-3 [-1, 16, 32, 32] 0
Conv2d-4 [-1, 32, 32, 32] 128
CondensingConv-5 [-1, 32, 32, 32] 0
BatchNorm2d-6 [-1, 32, 32, 32] 64
ReLU-7 [-1, 32, 32, 32] 0
Conv2d-8 [-1, 8, 32, 32] 576
_DenseLayer-9 [-1, 24, 32, 32] 0
BatchNorm2d-10 [-1, 24, 32, 32] 48
ReLU-11 [-1, 24, 32, 32] 0
Conv2d-12 [-1, 32, 32, 32] 192
CondensingConv-13 [-1, 32, 32, 32] 0
BatchNorm2d-14 [-1, 32, 32, 32] 64
ReLU-15 [-1, 32, 32, 32] 0
Conv2d-16 [-1, 8, 32, 32] 576
_DenseLayer-17 [-1, 32, 32, 32] 0
BatchNorm2d-18 [-1, 32, 32, 32] 64
ReLU-19 [-1, 32, 32, 32] 0
Conv2d-20 [-1, 32, 32, 32] 256
CondensingConv-21 [-1, 32, 32, 32] 0
BatchNorm2d-22 [-1, 32, 32, 32] 64
ReLU-23 [-1, 32, 32, 32] 0
Conv2d-24 [-1, 8, 32, 32] 576
_DenseLayer-25 [-1, 40, 32, 32] 0
BatchNorm2d-26 [-1, 40, 32, 32] 80
ReLU-27 [-1, 40, 32, 32] 0
Conv2d-28 [-1, 32, 32, 32] 320
CondensingConv-29 [-1, 32, 32, 32] 0
BatchNorm2d-30 [-1, 32, 32, 32] 64
ReLU-31 [-1, 32, 32, 32] 0
Conv2d-32 [-1, 8, 32, 32] 576
_DenseLayer-33 [-1, 48, 32, 32] 0
BatchNorm2d-34 [-1, 48, 32, 32] 96
ReLU-35 [-1, 48, 32, 32] 0
Conv2d-36 [-1, 32, 32, 32] 384
CondensingConv-37 [-1, 32, 32, 32] 0
BatchNorm2d-38 [-1, 32, 32, 32] 64
ReLU-39 [-1, 32, 32, 32] 0
Conv2d-40 [-1, 8, 32, 32] 576
_DenseLayer-41 [-1, 56, 32, 32] 0
BatchNorm2d-42 [-1, 56, 32, 32] 112
ReLU-43 [-1, 56, 32, 32] 0
Conv2d-44 [-1, 32, 32, 32] 448
CondensingConv-45 [-1, 32, 32, 32] 0
BatchNorm2d-46 [-1, 32, 32, 32] 64
ReLU-47 [-1, 32, 32, 32] 0
Conv2d-48 [-1, 8, 32, 32] 576
_DenseLayer-49 [-1, 64, 32, 32] 0
BatchNorm2d-50 [-1, 64, 32, 32] 128
ReLU-51 [-1, 64, 32, 32] 0
Conv2d-52 [-1, 32, 32, 32] 512
CondensingConv-53 [-1, 32, 32, 32] 0
BatchNorm2d-54 [-1, 32, 32, 32] 64
ReLU-55 [-1, 32, 32, 32] 0
Conv2d-56 [-1, 8, 32, 32] 576
_DenseLayer-57 [-1, 72, 32, 32] 0
BatchNorm2d-58 [-1, 72, 32, 32] 144
ReLU-59 [-1, 72, 32, 32] 0
Conv2d-60 [-1, 32, 32, 32] 576
CondensingConv-61 [-1, 32, 32, 32] 0
BatchNorm2d-62 [-1, 32, 32, 32] 64
ReLU-63 [-1, 32, 32, 32] 0
Conv2d-64 [-1, 8, 32, 32] 576
_DenseLayer-65 [-1, 80, 32, 32] 0
BatchNorm2d-66 [-1, 80, 32, 32] 160
ReLU-67 [-1, 80, 32, 32] 0
Conv2d-68 [-1, 32, 32, 32] 640
CondensingConv-69 [-1, 32, 32, 32] 0
BatchNorm2d-70 [-1, 32, 32, 32] 64
ReLU-71 [-1, 32, 32, 32] 0
Conv2d-72 [-1, 8, 32, 32] 576
_DenseLayer-73 [-1, 88, 32, 32] 0
BatchNorm2d-74 [-1, 88, 32, 32] 176
ReLU-75 [-1, 88, 32, 32] 0
Conv2d-76 [-1, 32, 32, 32] 704
CondensingConv-77 [-1, 32, 32, 32] 0
BatchNorm2d-78 [-1, 32, 32, 32] 64
ReLU-79 [-1, 32, 32, 32] 0
Conv2d-80 [-1, 8, 32, 32] 576
_DenseLayer-81 [-1, 96, 32, 32] 0
BatchNorm2d-82 [-1, 96, 32, 32] 192
ReLU-83 [-1, 96, 32, 32] 0
Conv2d-84 [-1, 32, 32, 32] 768
CondensingConv-85 [-1, 32, 32, 32] 0
BatchNorm2d-86 [-1, 32, 32, 32] 64
ReLU-87 [-1, 32, 32, 32] 0
Conv2d-88 [-1, 8, 32, 32] 576
_DenseLayer-89 [-1, 104, 32, 32] 0
BatchNorm2d-90 [-1, 104, 32, 32] 208
ReLU-91 [-1, 104, 32, 32] 0
Conv2d-92 [-1, 32, 32, 32] 832
CondensingConv-93 [-1, 32, 32, 32] 0
BatchNorm2d-94 [-1, 32, 32, 32] 64
ReLU-95 [-1, 32, 32, 32] 0
Conv2d-96 [-1, 8, 32, 32] 576
_DenseLayer-97 [-1, 112, 32, 32] 0
BatchNorm2d-98 [-1, 112, 32, 32] 224
ReLU-99 [-1, 112, 32, 32] 0
Conv2d-100 [-1, 32, 32, 32] 896
CondensingConv-101 [-1, 32, 32, 32] 0
BatchNorm2d-102 [-1, 32, 32, 32] 64
ReLU-103 [-1, 32, 32, 32] 0
Conv2d-104 [-1, 8, 32, 32] 576
_DenseLayer-105 [-1, 120, 32, 32] 0
BatchNorm2d-106 [-1, 120, 32, 32] 240
ReLU-107 [-1, 120, 32, 32] 0
Conv2d-108 [-1, 32, 32, 32] 960
CondensingConv-109 [-1, 32, 32, 32] 0
BatchNorm2d-110 [-1, 32, 32, 32] 64
ReLU-111 [-1, 32, 32, 32] 0
Conv2d-112 [-1, 8, 32, 32] 576
_DenseLayer-113 [-1, 128, 32, 32] 0
AvgPool2d-114 [-1, 128, 16, 16] 0
_Transition-115 [-1, 128, 16, 16] 0
BatchNorm2d-116 [-1, 128, 16, 16] 256
ReLU-117 [-1, 128, 16, 16] 0
Conv2d-118 [-1, 64, 16, 16] 2,048
CondensingConv-119 [-1, 64, 16, 16] 0
BatchNorm2d-120 [-1, 64, 16, 16] 128
ReLU-121 [-1, 64, 16, 16] 0
Conv2d-122 [-1, 16, 16, 16] 2,304
_DenseLayer-123 [-1, 144, 16, 16] 0
BatchNorm2d-124 [-1, 144, 16, 16] 288
ReLU-125 [-1, 144, 16, 16] 0
Conv2d-126 [-1, 64, 16, 16] 2,304
CondensingConv-127 [-1, 64, 16, 16] 0
BatchNorm2d-128 [-1, 64, 16, 16] 128
ReLU-129 [-1, 64, 16, 16] 0
Conv2d-130 [-1, 16, 16, 16] 2,304
_DenseLayer-131 [-1, 160, 16, 16] 0
BatchNorm2d-132 [-1, 160, 16, 16] 320
ReLU-133 [-1, 160, 16, 16] 0
Conv2d-134 [-1, 64, 16, 16] 2,560
CondensingConv-135 [-1, 64, 16, 16] 0
BatchNorm2d-136 [-1, 64, 16, 16] 128
ReLU-137 [-1, 64, 16, 16] 0
Conv2d-138 [-1, 16, 16, 16] 2,304
_DenseLayer-139 [-1, 176, 16, 16] 0
BatchNorm2d-140 [-1, 176, 16, 16] 352
ReLU-141 [-1, 176, 16, 16] 0
Conv2d-142 [-1, 64, 16, 16] 2,816
CondensingConv-143 [-1, 64, 16, 16] 0
BatchNorm2d-144 [-1, 64, 16, 16] 128
ReLU-145 [-1, 64, 16, 16] 0
Conv2d-146 [-1, 16, 16, 16] 2,304
_DenseLayer-147 [-1, 192, 16, 16] 0
BatchNorm2d-148 [-1, 192, 16, 16] 384
ReLU-149 [-1, 192, 16, 16] 0
Conv2d-150 [-1, 64, 16, 16] 3,072
CondensingConv-151 [-1, 64, 16, 16] 0
BatchNorm2d-152 [-1, 64, 16, 16] 128
ReLU-153 [-1, 64, 16, 16] 0
Conv2d-154 [-1, 16, 16, 16] 2,304
_DenseLayer-155 [-1, 208, 16, 16] 0
BatchNorm2d-156 [-1, 208, 16, 16] 416
ReLU-157 [-1, 208, 16, 16] 0
Conv2d-158 [-1, 64, 16, 16] 3,328
CondensingConv-159 [-1, 64, 16, 16] 0
BatchNorm2d-160 [-1, 64, 16, 16] 128
ReLU-161 [-1, 64, 16, 16] 0
Conv2d-162 [-1, 16, 16, 16] 2,304
_DenseLayer-163 [-1, 224, 16, 16] 0
BatchNorm2d-164 [-1, 224, 16, 16] 448
ReLU-165 [-1, 224, 16, 16] 0
Conv2d-166 [-1, 64, 16, 16] 3,584
CondensingConv-167 [-1, 64, 16, 16] 0
BatchNorm2d-168 [-1, 64, 16, 16] 128
ReLU-169 [-1, 64, 16, 16] 0
Conv2d-170 [-1, 16, 16, 16] 2,304
_DenseLayer-171 [-1, 240, 16, 16] 0
BatchNorm2d-172 [-1, 240, 16, 16] 480
ReLU-173 [-1, 240, 16, 16] 0
Conv2d-174 [-1, 64, 16, 16] 3,840
CondensingConv-175 [-1, 64, 16, 16] 0
BatchNorm2d-176 [-1, 64, 16, 16] 128
ReLU-177 [-1, 64, 16, 16] 0
Conv2d-178 [-1, 16, 16, 16] 2,304
_DenseLayer-179 [-1, 256, 16, 16] 0
BatchNorm2d-180 [-1, 256, 16, 16] 512
ReLU-181 [-1, 256, 16, 16] 0
Conv2d-182 [-1, 64, 16, 16] 4,096
CondensingConv-183 [-1, 64, 16, 16] 0
BatchNorm2d-184 [-1, 64, 16, 16] 128
ReLU-185 [-1, 64, 16, 16] 0
Conv2d-186 [-1, 16, 16, 16] 2,304
_DenseLayer-187 [-1, 272, 16, 16] 0
BatchNorm2d-188 [-1, 272, 16, 16] 544
ReLU-189 [-1, 272, 16, 16] 0
Conv2d-190 [-1, 64, 16, 16] 4,352
CondensingConv-191 [-1, 64, 16, 16] 0
BatchNorm2d-192 [-1, 64, 16, 16] 128
ReLU-193 [-1, 64, 16, 16] 0
Conv2d-194 [-1, 16, 16, 16] 2,304
_DenseLayer-195 [-1, 288, 16, 16] 0
BatchNorm2d-196 [-1, 288, 16, 16] 576
ReLU-197 [-1, 288, 16, 16] 0
Conv2d-198 [-1, 64, 16, 16] 4,608
CondensingConv-199 [-1, 64, 16, 16] 0
BatchNorm2d-200 [-1, 64, 16, 16] 128
ReLU-201 [-1, 64, 16, 16] 0
Conv2d-202 [-1, 16, 16, 16] 2,304
_DenseLayer-203 [-1, 304, 16, 16] 0
BatchNorm2d-204 [-1, 304, 16, 16] 608
ReLU-205 [-1, 304, 16, 16] 0
Conv2d-206 [-1, 64, 16, 16] 4,864
CondensingConv-207 [-1, 64, 16, 16] 0
BatchNorm2d-208 [-1, 64, 16, 16] 128
ReLU-209 [-1, 64, 16, 16] 0
Conv2d-210 [-1, 16, 16, 16] 2,304
_DenseLayer-211 [-1, 320, 16, 16] 0
BatchNorm2d-212 [-1, 320, 16, 16] 640
ReLU-213 [-1, 320, 16, 16] 0
Conv2d-214 [-1, 64, 16, 16] 5,120
CondensingConv-215 [-1, 64, 16, 16] 0
BatchNorm2d-216 [-1, 64, 16, 16] 128
ReLU-217 [-1, 64, 16, 16] 0
Conv2d-218 [-1, 16, 16, 16] 2,304
_DenseLayer-219 [-1, 336, 16, 16] 0
BatchNorm2d-220 [-1, 336, 16, 16] 672
ReLU-221 [-1, 336, 16, 16] 0
Conv2d-222 [-1, 64, 16, 16] 5,376
CondensingConv-223 [-1, 64, 16, 16] 0
BatchNorm2d-224 [-1, 64, 16, 16] 128
ReLU-225 [-1, 64, 16, 16] 0
Conv2d-226 [-1, 16, 16, 16] 2,304
_DenseLayer-227 [-1, 352, 16, 16] 0
AvgPool2d-228 [-1, 352, 8, 8] 0
_Transition-229 [-1, 352, 8, 8] 0
BatchNorm2d-230 [-1, 352, 8, 8] 704
ReLU-231 [-1, 352, 8, 8] 0
Conv2d-232 [-1, 128, 8, 8] 11,264
CondensingConv-233 [-1, 128, 8, 8] 0
BatchNorm2d-234 [-1, 128, 8, 8] 256
ReLU-235 [-1, 128, 8, 8] 0
Conv2d-236 [-1, 32, 8, 8] 9,216
_DenseLayer-237 [-1, 384, 8, 8] 0
BatchNorm2d-238 [-1, 384, 8, 8] 768
ReLU-239 [-1, 384, 8, 8] 0
Conv2d-240 [-1, 128, 8, 8] 12,288
CondensingConv-241 [-1, 128, 8, 8] 0
BatchNorm2d-242 [-1, 128, 8, 8] 256
ReLU-243 [-1, 128, 8, 8] 0
Conv2d-244 [-1, 32, 8, 8] 9,216
_DenseLayer-245 [-1, 416, 8, 8] 0
BatchNorm2d-246 [-1, 416, 8, 8] 832
ReLU-247 [-1, 416, 8, 8] 0
Conv2d-248 [-1, 128, 8, 8] 13,312
CondensingConv-249 [-1, 128, 8, 8] 0
BatchNorm2d-250 [-1, 128, 8, 8] 256
ReLU-251 [-1, 128, 8, 8] 0
Conv2d-252 [-1, 32, 8, 8] 9,216
_DenseLayer-253 [-1, 448, 8, 8] 0
BatchNorm2d-254 [-1, 448, 8, 8] 896
ReLU-255 [-1, 448, 8, 8] 0
Conv2d-256 [-1, 128, 8, 8] 14,336
CondensingConv-257 [-1, 128, 8, 8] 0
BatchNorm2d-258 [-1, 128, 8, 8] 256
ReLU-259 [-1, 128, 8, 8] 0
Conv2d-260 [-1, 32, 8, 8] 9,216
_DenseLayer-261 [-1, 480, 8, 8] 0
BatchNorm2d-262 [-1, 480, 8, 8] 960
ReLU-263 [-1, 480, 8, 8] 0
Conv2d-264 [-1, 128, 8, 8] 15,360
CondensingConv-265 [-1, 128, 8, 8] 0
BatchNorm2d-266 [-1, 128, 8, 8] 256
ReLU-267 [-1, 128, 8, 8] 0
Conv2d-268 [-1, 32, 8, 8] 9,216
_DenseLayer-269 [-1, 512, 8, 8] 0
BatchNorm2d-270 [-1, 512, 8, 8] 1,024
ReLU-271 [-1, 512, 8, 8] 0
Conv2d-272 [-1, 128, 8, 8] 16,384
CondensingConv-273 [-1, 128, 8, 8] 0
BatchNorm2d-274 [-1, 128, 8, 8] 256
ReLU-275 [-1, 128, 8, 8] 0
Conv2d-276 [-1, 32, 8, 8] 9,216
_DenseLayer-277 [-1, 544, 8, 8] 0
BatchNorm2d-278 [-1, 544, 8, 8] 1,088
ReLU-279 [-1, 544, 8, 8] 0
Conv2d-280 [-1, 128, 8, 8] 17,408
CondensingConv-281 [-1, 128, 8, 8] 0
BatchNorm2d-282 [-1, 128, 8, 8] 256
ReLU-283 [-1, 128, 8, 8] 0
Conv2d-284 [-1, 32, 8, 8] 9,216
_DenseLayer-285 [-1, 576, 8, 8] 0
BatchNorm2d-286 [-1, 576, 8, 8] 1,152
ReLU-287 [-1, 576, 8, 8] 0
Conv2d-288 [-1, 128, 8, 8] 18,432
CondensingConv-289 [-1, 128, 8, 8] 0
BatchNorm2d-290 [-1, 128, 8, 8] 256
ReLU-291 [-1, 128, 8, 8] 0
Conv2d-292 [-1, 32, 8, 8] 9,216
_DenseLayer-293 [-1, 608, 8, 8] 0
BatchNorm2d-294 [-1, 608, 8, 8] 1,216
ReLU-295 [-1, 608, 8, 8] 0
Conv2d-296 [-1, 128, 8, 8] 19,456
CondensingConv-297 [-1, 128, 8, 8] 0
BatchNorm2d-298 [-1, 128, 8, 8] 256
ReLU-299 [-1, 128, 8, 8] 0
Conv2d-300 [-1, 32, 8, 8] 9,216
_DenseLayer-301 [-1, 640, 8, 8] 0
BatchNorm2d-302 [-1, 640, 8, 8] 1,280
ReLU-303 [-1, 640, 8, 8] 0
Conv2d-304 [-1, 128, 8, 8] 20,480
CondensingConv-305 [-1, 128, 8, 8] 0
BatchNorm2d-306 [-1, 128, 8, 8] 256
ReLU-307 [-1, 128, 8, 8] 0
Conv2d-308 [-1, 32, 8, 8] 9,216
_DenseLayer-309 [-1, 672, 8, 8] 0
BatchNorm2d-310 [-1, 672, 8, 8] 1,344
ReLU-311 [-1, 672, 8, 8] 0
Conv2d-312 [-1, 128, 8, 8] 21,504
CondensingConv-313 [-1, 128, 8, 8] 0
BatchNorm2d-314 [-1, 128, 8, 8] 256
ReLU-315 [-1, 128, 8, 8] 0
Conv2d-316 [-1, 32, 8, 8] 9,216
_DenseLayer-317 [-1, 704, 8, 8] 0
BatchNorm2d-318 [-1, 704, 8, 8] 1,408
ReLU-319 [-1, 704, 8, 8] 0
Conv2d-320 [-1, 128, 8, 8] 22,528
CondensingConv-321 [-1, 128, 8, 8] 0
BatchNorm2d-322 [-1, 128, 8, 8] 256
ReLU-323 [-1, 128, 8, 8] 0
Conv2d-324 [-1, 32, 8, 8] 9,216
_DenseLayer-325 [-1, 736, 8, 8] 0
BatchNorm2d-326 [-1, 736, 8, 8] 1,472
ReLU-327 [-1, 736, 8, 8] 0
Conv2d-328 [-1, 128, 8, 8] 23,552
CondensingConv-329 [-1, 128, 8, 8] 0
BatchNorm2d-330 [-1, 128, 8, 8] 256
ReLU-331 [-1, 128, 8, 8] 0
Conv2d-332 [-1, 32, 8, 8] 9,216
_DenseLayer-333 [-1, 768, 8, 8] 0
BatchNorm2d-334 [-1, 768, 8, 8] 1,536
ReLU-335 [-1, 768, 8, 8] 0
Conv2d-336 [-1, 128, 8, 8] 24,576
CondensingConv-337 [-1, 128, 8, 8] 0
BatchNorm2d-338 [-1, 128, 8, 8] 256
ReLU-339 [-1, 128, 8, 8] 0
Conv2d-340 [-1, 32, 8, 8] 9,216
_DenseLayer-341 [-1, 800, 8, 8] 0
BatchNorm2d-342 [-1, 800, 8, 8] 1,600
ReLU-343 [-1, 800, 8, 8] 0
AvgPool2d-344 [-1, 800, 1, 1] 0
Linear-345 [-1, 10] 4,010
CondensingLinear-346 [-1, 10] 0
CondenseNet-347 [-1, 10] 0
================================================================
Total params: 516,202
Trainable params: 516,202
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 0.01
Forward/backward pass size (MB): 82.15
Params size (MB): 1.97
Estimated Total Size (MB): 84.13
----------------------------------------------------------------
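For reference, a profile like the one above can be produced with the torchsummary package along these lines (a minimal sketch: the constructor arguments mirror the CLI flags used in this thread, and the exact field names of the args container are assumptions about this repo's argument parser):

```python
import torch
from torchsummary import summary
from models import condensenet  # model definition from this repo

# Hypothetical args container mirroring `--stages 14-14-14 --growth 8-16-32`
# on cifar10; the field names below are assumptions, not the repo's exact API.
class Args:
    stages = [14, 14, 14]
    growth = [8, 16, 32]
    data = 'cifar10'
    num_classes = 10
    bottleneck = 4
    group_1x1 = 4
    group_3x3 = 4
    condense_factor = 4
    dropout_rate = 0
    reduction = 0.5

model = condensenet.CondenseNet(Args())
summary(model, (3, 32, 32), device='cpu')  # prints a per-layer table like the one above
```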
Thanks!
Okay, I know where the difference comes from; the params you provided are from the converted model, right?
python main.py --model condensenet_converted -b 64 -j 2 cifar10 --epochs 300 --stages 14-14-14 --growth 8-16-32 --gpu 0 --evaluate-from E:/CondenseNet_log/results/savedir/save_models/converted_model_best.pth.tar
The profile I submitted in the issue was generated before training begins. Thanks a lot!
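As a cross-check, the converted checkpoint's raw parameter count can be inspected directly, independent of torchsummary (a minimal sketch: the path comes from the command above, and the 'state_dict' key is an assumption about how the checkpoint was saved):

```python
import torch

# Load the converted checkpoint from the command above (path as given there).
ckpt = torch.load(
    'E:/CondenseNet_log/results/savedir/save_models/converted_model_best.pth.tar',
    map_location='cpu')
state_dict = ckpt.get('state_dict', ckpt)  # assumption: weights wrapped under 'state_dict'

# Note: this counts every tensor in the state dict, including BatchNorm running
# stats (buffers), so it can be slightly larger than torchsummary's "Total params".
total = sum(t.numel() for t in state_dict.values() if torch.is_tensor(t))
print('%d tensors, %.2fM values' % (len(state_dict), total / 1e6))
```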
Hi, I noticed that condensenet-86 is reported as 0.52M parameters on CIFAR-10. However, using the torchsummary package, the total calculated params come out differently from that figure.
Do you know why there is this difference? Thanks in advance.
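For reference, the per-layer numbers in a torchsummary profile follow the standard formulas; a quick sanity check against the first two rows of the table above (no assumptions beyond the table itself):

```python
# Conv2d-1: 3 input channels -> 16 output channels, 3x3 kernel, no bias
conv1 = 3 * 16 * 3 * 3   # = 432, matches the table
# BatchNorm2d-2: one weight and one bias per channel
bn2 = 2 * 16             # = 32, matches the table
print(conv1, bn2)
```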