ShichenLiu / CondenseNet

CondenseNet: Lightweight CNN for mobile devices
MIT License

condensenet-86 parameter count differs from torchsummary #31

Closed lizhenstat closed 4 years ago

lizhenstat commented 4 years ago

Hi, I noticed that condensenet-86 on CIFAR-10 is reported as 0.52M parameters. However, the torchsummary package gives me a different total. On CIFAR-10 I computed the parameter count as follows:

    # `model` here is the CondenseNet-86 instance (e.g. as built in main.py)
    from torchsummary import summary
    summary(model, (3, 32, 32), device="cpu")
    exit(0)
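
For reference, the same total can be cross-checked without torchsummary (a minimal sketch; it assumes `model` is the same instance and counts only trainable tensors, matching torchsummary's "Trainable params" row):

    # Direct parameter count, independent of torchsummary
    total = sum(p.numel() for p in model.parameters() if p.requires_grad)
    print(f"{total:,} trainable parameters")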

Do you know why there is a difference? Thanks in advance.

ShichenLiu commented 4 years ago

Hi there,

I think you were testing the wrong model. Here is our profile using the package you mentioned:

----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
================================================================
            Conv2d-1           [-1, 16, 32, 32]             432
       BatchNorm2d-2           [-1, 16, 32, 32]              32
              ReLU-3           [-1, 16, 32, 32]               0
            Conv2d-4           [-1, 32, 32, 32]             128
    CondensingConv-5           [-1, 32, 32, 32]               0
       BatchNorm2d-6           [-1, 32, 32, 32]              64
              ReLU-7           [-1, 32, 32, 32]               0
            Conv2d-8            [-1, 8, 32, 32]             576
       _DenseLayer-9           [-1, 24, 32, 32]               0
      BatchNorm2d-10           [-1, 24, 32, 32]              48
             ReLU-11           [-1, 24, 32, 32]               0
           Conv2d-12           [-1, 32, 32, 32]             192
   CondensingConv-13           [-1, 32, 32, 32]               0
      BatchNorm2d-14           [-1, 32, 32, 32]              64
             ReLU-15           [-1, 32, 32, 32]               0
           Conv2d-16            [-1, 8, 32, 32]             576
      _DenseLayer-17           [-1, 32, 32, 32]               0
      BatchNorm2d-18           [-1, 32, 32, 32]              64
             ReLU-19           [-1, 32, 32, 32]               0
           Conv2d-20           [-1, 32, 32, 32]             256
   CondensingConv-21           [-1, 32, 32, 32]               0
      BatchNorm2d-22           [-1, 32, 32, 32]              64
             ReLU-23           [-1, 32, 32, 32]               0
           Conv2d-24            [-1, 8, 32, 32]             576
      _DenseLayer-25           [-1, 40, 32, 32]               0
      BatchNorm2d-26           [-1, 40, 32, 32]              80
             ReLU-27           [-1, 40, 32, 32]               0
           Conv2d-28           [-1, 32, 32, 32]             320
   CondensingConv-29           [-1, 32, 32, 32]               0
      BatchNorm2d-30           [-1, 32, 32, 32]              64
             ReLU-31           [-1, 32, 32, 32]               0
           Conv2d-32            [-1, 8, 32, 32]             576
      _DenseLayer-33           [-1, 48, 32, 32]               0
      BatchNorm2d-34           [-1, 48, 32, 32]              96
             ReLU-35           [-1, 48, 32, 32]               0
           Conv2d-36           [-1, 32, 32, 32]             384
   CondensingConv-37           [-1, 32, 32, 32]               0
      BatchNorm2d-38           [-1, 32, 32, 32]              64
             ReLU-39           [-1, 32, 32, 32]               0
           Conv2d-40            [-1, 8, 32, 32]             576
      _DenseLayer-41           [-1, 56, 32, 32]               0
      BatchNorm2d-42           [-1, 56, 32, 32]             112
             ReLU-43           [-1, 56, 32, 32]               0
           Conv2d-44           [-1, 32, 32, 32]             448
   CondensingConv-45           [-1, 32, 32, 32]               0
      BatchNorm2d-46           [-1, 32, 32, 32]              64
             ReLU-47           [-1, 32, 32, 32]               0
           Conv2d-48            [-1, 8, 32, 32]             576
      _DenseLayer-49           [-1, 64, 32, 32]               0
      BatchNorm2d-50           [-1, 64, 32, 32]             128
             ReLU-51           [-1, 64, 32, 32]               0
           Conv2d-52           [-1, 32, 32, 32]             512
   CondensingConv-53           [-1, 32, 32, 32]               0
      BatchNorm2d-54           [-1, 32, 32, 32]              64
             ReLU-55           [-1, 32, 32, 32]               0
           Conv2d-56            [-1, 8, 32, 32]             576
      _DenseLayer-57           [-1, 72, 32, 32]               0
      BatchNorm2d-58           [-1, 72, 32, 32]             144
             ReLU-59           [-1, 72, 32, 32]               0
           Conv2d-60           [-1, 32, 32, 32]             576
   CondensingConv-61           [-1, 32, 32, 32]               0
      BatchNorm2d-62           [-1, 32, 32, 32]              64
             ReLU-63           [-1, 32, 32, 32]               0
           Conv2d-64            [-1, 8, 32, 32]             576
      _DenseLayer-65           [-1, 80, 32, 32]               0
      BatchNorm2d-66           [-1, 80, 32, 32]             160
             ReLU-67           [-1, 80, 32, 32]               0
           Conv2d-68           [-1, 32, 32, 32]             640
   CondensingConv-69           [-1, 32, 32, 32]               0
      BatchNorm2d-70           [-1, 32, 32, 32]              64
             ReLU-71           [-1, 32, 32, 32]               0
           Conv2d-72            [-1, 8, 32, 32]             576
      _DenseLayer-73           [-1, 88, 32, 32]               0
      BatchNorm2d-74           [-1, 88, 32, 32]             176
             ReLU-75           [-1, 88, 32, 32]               0
           Conv2d-76           [-1, 32, 32, 32]             704
   CondensingConv-77           [-1, 32, 32, 32]               0
      BatchNorm2d-78           [-1, 32, 32, 32]              64
             ReLU-79           [-1, 32, 32, 32]               0
           Conv2d-80            [-1, 8, 32, 32]             576
      _DenseLayer-81           [-1, 96, 32, 32]               0
      BatchNorm2d-82           [-1, 96, 32, 32]             192
             ReLU-83           [-1, 96, 32, 32]               0
           Conv2d-84           [-1, 32, 32, 32]             768
   CondensingConv-85           [-1, 32, 32, 32]               0
      BatchNorm2d-86           [-1, 32, 32, 32]              64
             ReLU-87           [-1, 32, 32, 32]               0
           Conv2d-88            [-1, 8, 32, 32]             576
      _DenseLayer-89          [-1, 104, 32, 32]               0
      BatchNorm2d-90          [-1, 104, 32, 32]             208
             ReLU-91          [-1, 104, 32, 32]               0
           Conv2d-92           [-1, 32, 32, 32]             832
   CondensingConv-93           [-1, 32, 32, 32]               0
      BatchNorm2d-94           [-1, 32, 32, 32]              64
             ReLU-95           [-1, 32, 32, 32]               0
           Conv2d-96            [-1, 8, 32, 32]             576
      _DenseLayer-97          [-1, 112, 32, 32]               0
      BatchNorm2d-98          [-1, 112, 32, 32]             224
             ReLU-99          [-1, 112, 32, 32]               0
          Conv2d-100           [-1, 32, 32, 32]             896
  CondensingConv-101           [-1, 32, 32, 32]               0
     BatchNorm2d-102           [-1, 32, 32, 32]              64
            ReLU-103           [-1, 32, 32, 32]               0
          Conv2d-104            [-1, 8, 32, 32]             576
     _DenseLayer-105          [-1, 120, 32, 32]               0
     BatchNorm2d-106          [-1, 120, 32, 32]             240
            ReLU-107          [-1, 120, 32, 32]               0
          Conv2d-108           [-1, 32, 32, 32]             960
  CondensingConv-109           [-1, 32, 32, 32]               0
     BatchNorm2d-110           [-1, 32, 32, 32]              64
            ReLU-111           [-1, 32, 32, 32]               0
          Conv2d-112            [-1, 8, 32, 32]             576
     _DenseLayer-113          [-1, 128, 32, 32]               0
       AvgPool2d-114          [-1, 128, 16, 16]               0
     _Transition-115          [-1, 128, 16, 16]               0
     BatchNorm2d-116          [-1, 128, 16, 16]             256
            ReLU-117          [-1, 128, 16, 16]               0
          Conv2d-118           [-1, 64, 16, 16]           2,048
  CondensingConv-119           [-1, 64, 16, 16]               0
     BatchNorm2d-120           [-1, 64, 16, 16]             128
            ReLU-121           [-1, 64, 16, 16]               0
          Conv2d-122           [-1, 16, 16, 16]           2,304
     _DenseLayer-123          [-1, 144, 16, 16]               0
     BatchNorm2d-124          [-1, 144, 16, 16]             288
            ReLU-125          [-1, 144, 16, 16]               0
          Conv2d-126           [-1, 64, 16, 16]           2,304
  CondensingConv-127           [-1, 64, 16, 16]               0
     BatchNorm2d-128           [-1, 64, 16, 16]             128
            ReLU-129           [-1, 64, 16, 16]               0
          Conv2d-130           [-1, 16, 16, 16]           2,304
     _DenseLayer-131          [-1, 160, 16, 16]               0
     BatchNorm2d-132          [-1, 160, 16, 16]             320
            ReLU-133          [-1, 160, 16, 16]               0
          Conv2d-134           [-1, 64, 16, 16]           2,560
  CondensingConv-135           [-1, 64, 16, 16]               0
     BatchNorm2d-136           [-1, 64, 16, 16]             128
            ReLU-137           [-1, 64, 16, 16]               0
          Conv2d-138           [-1, 16, 16, 16]           2,304
     _DenseLayer-139          [-1, 176, 16, 16]               0
     BatchNorm2d-140          [-1, 176, 16, 16]             352
            ReLU-141          [-1, 176, 16, 16]               0
          Conv2d-142           [-1, 64, 16, 16]           2,816
  CondensingConv-143           [-1, 64, 16, 16]               0
     BatchNorm2d-144           [-1, 64, 16, 16]             128
            ReLU-145           [-1, 64, 16, 16]               0
          Conv2d-146           [-1, 16, 16, 16]           2,304
     _DenseLayer-147          [-1, 192, 16, 16]               0
     BatchNorm2d-148          [-1, 192, 16, 16]             384
            ReLU-149          [-1, 192, 16, 16]               0
          Conv2d-150           [-1, 64, 16, 16]           3,072
  CondensingConv-151           [-1, 64, 16, 16]               0
     BatchNorm2d-152           [-1, 64, 16, 16]             128
            ReLU-153           [-1, 64, 16, 16]               0
          Conv2d-154           [-1, 16, 16, 16]           2,304
     _DenseLayer-155          [-1, 208, 16, 16]               0
     BatchNorm2d-156          [-1, 208, 16, 16]             416
            ReLU-157          [-1, 208, 16, 16]               0
          Conv2d-158           [-1, 64, 16, 16]           3,328
  CondensingConv-159           [-1, 64, 16, 16]               0
     BatchNorm2d-160           [-1, 64, 16, 16]             128
            ReLU-161           [-1, 64, 16, 16]               0
          Conv2d-162           [-1, 16, 16, 16]           2,304
     _DenseLayer-163          [-1, 224, 16, 16]               0
     BatchNorm2d-164          [-1, 224, 16, 16]             448
            ReLU-165          [-1, 224, 16, 16]               0
          Conv2d-166           [-1, 64, 16, 16]           3,584
  CondensingConv-167           [-1, 64, 16, 16]               0
     BatchNorm2d-168           [-1, 64, 16, 16]             128
            ReLU-169           [-1, 64, 16, 16]               0
          Conv2d-170           [-1, 16, 16, 16]           2,304
     _DenseLayer-171          [-1, 240, 16, 16]               0
     BatchNorm2d-172          [-1, 240, 16, 16]             480
            ReLU-173          [-1, 240, 16, 16]               0
          Conv2d-174           [-1, 64, 16, 16]           3,840
  CondensingConv-175           [-1, 64, 16, 16]               0
     BatchNorm2d-176           [-1, 64, 16, 16]             128
            ReLU-177           [-1, 64, 16, 16]               0
          Conv2d-178           [-1, 16, 16, 16]           2,304
     _DenseLayer-179          [-1, 256, 16, 16]               0
     BatchNorm2d-180          [-1, 256, 16, 16]             512
            ReLU-181          [-1, 256, 16, 16]               0
          Conv2d-182           [-1, 64, 16, 16]           4,096
  CondensingConv-183           [-1, 64, 16, 16]               0
     BatchNorm2d-184           [-1, 64, 16, 16]             128
            ReLU-185           [-1, 64, 16, 16]               0
          Conv2d-186           [-1, 16, 16, 16]           2,304
     _DenseLayer-187          [-1, 272, 16, 16]               0
     BatchNorm2d-188          [-1, 272, 16, 16]             544
            ReLU-189          [-1, 272, 16, 16]               0
          Conv2d-190           [-1, 64, 16, 16]           4,352
  CondensingConv-191           [-1, 64, 16, 16]               0
     BatchNorm2d-192           [-1, 64, 16, 16]             128
            ReLU-193           [-1, 64, 16, 16]               0
          Conv2d-194           [-1, 16, 16, 16]           2,304
     _DenseLayer-195          [-1, 288, 16, 16]               0
     BatchNorm2d-196          [-1, 288, 16, 16]             576
            ReLU-197          [-1, 288, 16, 16]               0
          Conv2d-198           [-1, 64, 16, 16]           4,608
  CondensingConv-199           [-1, 64, 16, 16]               0
     BatchNorm2d-200           [-1, 64, 16, 16]             128
            ReLU-201           [-1, 64, 16, 16]               0
          Conv2d-202           [-1, 16, 16, 16]           2,304
     _DenseLayer-203          [-1, 304, 16, 16]               0
     BatchNorm2d-204          [-1, 304, 16, 16]             608
            ReLU-205          [-1, 304, 16, 16]               0
          Conv2d-206           [-1, 64, 16, 16]           4,864
  CondensingConv-207           [-1, 64, 16, 16]               0
     BatchNorm2d-208           [-1, 64, 16, 16]             128
            ReLU-209           [-1, 64, 16, 16]               0
          Conv2d-210           [-1, 16, 16, 16]           2,304
     _DenseLayer-211          [-1, 320, 16, 16]               0
     BatchNorm2d-212          [-1, 320, 16, 16]             640
            ReLU-213          [-1, 320, 16, 16]               0
          Conv2d-214           [-1, 64, 16, 16]           5,120
  CondensingConv-215           [-1, 64, 16, 16]               0
     BatchNorm2d-216           [-1, 64, 16, 16]             128
            ReLU-217           [-1, 64, 16, 16]               0
          Conv2d-218           [-1, 16, 16, 16]           2,304
     _DenseLayer-219          [-1, 336, 16, 16]               0
     BatchNorm2d-220          [-1, 336, 16, 16]             672
            ReLU-221          [-1, 336, 16, 16]               0
          Conv2d-222           [-1, 64, 16, 16]           5,376
  CondensingConv-223           [-1, 64, 16, 16]               0
     BatchNorm2d-224           [-1, 64, 16, 16]             128
            ReLU-225           [-1, 64, 16, 16]               0
          Conv2d-226           [-1, 16, 16, 16]           2,304
     _DenseLayer-227          [-1, 352, 16, 16]               0
       AvgPool2d-228            [-1, 352, 8, 8]               0
     _Transition-229            [-1, 352, 8, 8]               0
     BatchNorm2d-230            [-1, 352, 8, 8]             704
            ReLU-231            [-1, 352, 8, 8]               0
          Conv2d-232            [-1, 128, 8, 8]          11,264
  CondensingConv-233            [-1, 128, 8, 8]               0
     BatchNorm2d-234            [-1, 128, 8, 8]             256
            ReLU-235            [-1, 128, 8, 8]               0
          Conv2d-236             [-1, 32, 8, 8]           9,216
     _DenseLayer-237            [-1, 384, 8, 8]               0
     BatchNorm2d-238            [-1, 384, 8, 8]             768
            ReLU-239            [-1, 384, 8, 8]               0
          Conv2d-240            [-1, 128, 8, 8]          12,288
  CondensingConv-241            [-1, 128, 8, 8]               0
     BatchNorm2d-242            [-1, 128, 8, 8]             256
            ReLU-243            [-1, 128, 8, 8]               0
          Conv2d-244             [-1, 32, 8, 8]           9,216
     _DenseLayer-245            [-1, 416, 8, 8]               0
     BatchNorm2d-246            [-1, 416, 8, 8]             832
            ReLU-247            [-1, 416, 8, 8]               0
          Conv2d-248            [-1, 128, 8, 8]          13,312
  CondensingConv-249            [-1, 128, 8, 8]               0
     BatchNorm2d-250            [-1, 128, 8, 8]             256
            ReLU-251            [-1, 128, 8, 8]               0
          Conv2d-252             [-1, 32, 8, 8]           9,216
     _DenseLayer-253            [-1, 448, 8, 8]               0
     BatchNorm2d-254            [-1, 448, 8, 8]             896
            ReLU-255            [-1, 448, 8, 8]               0
          Conv2d-256            [-1, 128, 8, 8]          14,336
  CondensingConv-257            [-1, 128, 8, 8]               0
     BatchNorm2d-258            [-1, 128, 8, 8]             256
            ReLU-259            [-1, 128, 8, 8]               0
          Conv2d-260             [-1, 32, 8, 8]           9,216
     _DenseLayer-261            [-1, 480, 8, 8]               0
     BatchNorm2d-262            [-1, 480, 8, 8]             960
            ReLU-263            [-1, 480, 8, 8]               0
          Conv2d-264            [-1, 128, 8, 8]          15,360
  CondensingConv-265            [-1, 128, 8, 8]               0
     BatchNorm2d-266            [-1, 128, 8, 8]             256
            ReLU-267            [-1, 128, 8, 8]               0
          Conv2d-268             [-1, 32, 8, 8]           9,216
     _DenseLayer-269            [-1, 512, 8, 8]               0
     BatchNorm2d-270            [-1, 512, 8, 8]           1,024
            ReLU-271            [-1, 512, 8, 8]               0
          Conv2d-272            [-1, 128, 8, 8]          16,384
  CondensingConv-273            [-1, 128, 8, 8]               0
     BatchNorm2d-274            [-1, 128, 8, 8]             256
            ReLU-275            [-1, 128, 8, 8]               0
          Conv2d-276             [-1, 32, 8, 8]           9,216
     _DenseLayer-277            [-1, 544, 8, 8]               0
     BatchNorm2d-278            [-1, 544, 8, 8]           1,088
            ReLU-279            [-1, 544, 8, 8]               0
          Conv2d-280            [-1, 128, 8, 8]          17,408
  CondensingConv-281            [-1, 128, 8, 8]               0
     BatchNorm2d-282            [-1, 128, 8, 8]             256
            ReLU-283            [-1, 128, 8, 8]               0
          Conv2d-284             [-1, 32, 8, 8]           9,216
     _DenseLayer-285            [-1, 576, 8, 8]               0
     BatchNorm2d-286            [-1, 576, 8, 8]           1,152
            ReLU-287            [-1, 576, 8, 8]               0
          Conv2d-288            [-1, 128, 8, 8]          18,432
  CondensingConv-289            [-1, 128, 8, 8]               0
     BatchNorm2d-290            [-1, 128, 8, 8]             256
            ReLU-291            [-1, 128, 8, 8]               0
          Conv2d-292             [-1, 32, 8, 8]           9,216
     _DenseLayer-293            [-1, 608, 8, 8]               0
     BatchNorm2d-294            [-1, 608, 8, 8]           1,216
            ReLU-295            [-1, 608, 8, 8]               0
          Conv2d-296            [-1, 128, 8, 8]          19,456
  CondensingConv-297            [-1, 128, 8, 8]               0
     BatchNorm2d-298            [-1, 128, 8, 8]             256
            ReLU-299            [-1, 128, 8, 8]               0
          Conv2d-300             [-1, 32, 8, 8]           9,216
     _DenseLayer-301            [-1, 640, 8, 8]               0
     BatchNorm2d-302            [-1, 640, 8, 8]           1,280
            ReLU-303            [-1, 640, 8, 8]               0
          Conv2d-304            [-1, 128, 8, 8]          20,480
  CondensingConv-305            [-1, 128, 8, 8]               0
     BatchNorm2d-306            [-1, 128, 8, 8]             256
            ReLU-307            [-1, 128, 8, 8]               0
          Conv2d-308             [-1, 32, 8, 8]           9,216
     _DenseLayer-309            [-1, 672, 8, 8]               0
     BatchNorm2d-310            [-1, 672, 8, 8]           1,344
            ReLU-311            [-1, 672, 8, 8]               0
          Conv2d-312            [-1, 128, 8, 8]          21,504
  CondensingConv-313            [-1, 128, 8, 8]               0
     BatchNorm2d-314            [-1, 128, 8, 8]             256
            ReLU-315            [-1, 128, 8, 8]               0
          Conv2d-316             [-1, 32, 8, 8]           9,216
     _DenseLayer-317            [-1, 704, 8, 8]               0
     BatchNorm2d-318            [-1, 704, 8, 8]           1,408
            ReLU-319            [-1, 704, 8, 8]               0
          Conv2d-320            [-1, 128, 8, 8]          22,528
  CondensingConv-321            [-1, 128, 8, 8]               0
     BatchNorm2d-322            [-1, 128, 8, 8]             256
            ReLU-323            [-1, 128, 8, 8]               0
          Conv2d-324             [-1, 32, 8, 8]           9,216
     _DenseLayer-325            [-1, 736, 8, 8]               0
     BatchNorm2d-326            [-1, 736, 8, 8]           1,472
            ReLU-327            [-1, 736, 8, 8]               0
          Conv2d-328            [-1, 128, 8, 8]          23,552
  CondensingConv-329            [-1, 128, 8, 8]               0
     BatchNorm2d-330            [-1, 128, 8, 8]             256
            ReLU-331            [-1, 128, 8, 8]               0
          Conv2d-332             [-1, 32, 8, 8]           9,216
     _DenseLayer-333            [-1, 768, 8, 8]               0
     BatchNorm2d-334            [-1, 768, 8, 8]           1,536
            ReLU-335            [-1, 768, 8, 8]               0
          Conv2d-336            [-1, 128, 8, 8]          24,576
  CondensingConv-337            [-1, 128, 8, 8]               0
     BatchNorm2d-338            [-1, 128, 8, 8]             256
            ReLU-339            [-1, 128, 8, 8]               0
          Conv2d-340             [-1, 32, 8, 8]           9,216
     _DenseLayer-341            [-1, 800, 8, 8]               0
     BatchNorm2d-342            [-1, 800, 8, 8]           1,600
            ReLU-343            [-1, 800, 8, 8]               0
       AvgPool2d-344            [-1, 800, 1, 1]               0
          Linear-345                   [-1, 10]           4,010
CondensingLinear-346                   [-1, 10]               0
     CondenseNet-347                   [-1, 10]               0
================================================================
Total params: 516,202
Trainable params: 516,202
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 0.01
Forward/backward pass size (MB): 82.15
Params size (MB): 1.97
Estimated Total Size (MB): 84.13
----------------------------------------------------------------
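
Two notes on reading this table. Wrapper modules such as CondensingConv and _DenseLayer report 0 parameters because the weights live in the Conv2d/BatchNorm2d children, which are listed separately. The per-row counts also follow the usual formulas; a quick sanity check on the first two rows (illustrative arithmetic only):

    # Conv2d-1: 3x3 stem conv, 3 -> 16 channels, no bias
    print(3 * 16 * 3 * 3)  # 432
    # BatchNorm2d-2: weight + bias per channel, 16 channels
    print(2 * 16)          # 32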

Thanks!

lizhenstat commented 4 years ago

Okay, I see where the difference comes from: the parameter counts you provided are from the converted model, right?

    python main.py --model condensenet_converted -b 64 -j 2 cifar10 --epochs 300 --stages 14-14-14 --growth 8-16-32 --gpu 0 \
    --evaluate-from E:/CondenseNet_log/results/savedir/save_models/converted_model_best.pth.tar

The count I submitted in this issue was taken before the first training epoch, i.e. from the un-converted model. Thanks a lot.
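
If I understand the conversion correctly (assuming the default condensation factor C = 4 for this configuration), that also explains the size of the gap: each learned 1x1 group convolution keeps only 1/C of its dense weights after conversion. For the first one (Conv2d-4 in the table above):

    # Dense 1x1 conv before conversion: 16 in, 32 out channels
    dense = 16 * 32       # 512 weights
    # After condensation with factor C = 4, 1/4 of the weights survive
    print(dense // 4)     # 128, matching Conv2d-4 above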