huawei-noah / Efficient-AI-Backbones

Efficient AI Backbones including GhostNet, TNT and MLP, developed by Huawei Noah's Ark Lab.

Why did you exclude EfficientNetB0 from Accuracy-Latency chart? #1

Closed. AlexeyAB closed this issue 4 years ago.

AlexeyAB commented 4 years ago

@iamhankai Hi,

Great work!

  1. Why did you exclude EfficientNet-B0 (0.390 BFLOPs, 76.3% Top-1) from the Accuracy-Latency chart?

  2. Also, what mini_batch_size did you use for training GhostNet?

[Image: flops_latency (accuracy vs. latency chart)]

iamhankai commented 4 years ago

Actually, we have tested the latency of EfficientNet-B0. Its latency is too large (~98 ms) to fit inside the current chart.

iamhankai commented 4 years ago

In addition, we have also tested the latency of MixNet (https://github.com/AlexeyAB/darknet/issues/4503), and its latency is also too large (>85 ms). Using various kernel sizes within the same depthwise conv layer is harmful to inference speed.
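For context, a MixNet-style mixed depthwise convolution splits the channels into groups and convolves each group with a different kernel size before concatenating. The sketch below (my illustration; the split sizes and kernel sizes are assumptions) shows why it maps to several small convolutions instead of one fused kernel, which is what hurts latency:

```python
import torch
import torch.nn as nn

class MixedDepthwiseConvSketch(nn.Module):
    """Illustrative MixNet-style mixed depthwise conv (not the official
    code): channels are split into groups, each group is convolved
    depthwise with a different kernel size, then the results are
    concatenated. Each group is a separate conv call, so the layer
    cannot run as one fused depthwise kernel."""
    def __init__(self, ch, kernel_sizes=(3, 5, 7)):
        super().__init__()
        splits = [ch // len(kernel_sizes)] * len(kernel_sizes)
        splits[0] += ch - sum(splits)  # absorb any remainder in the first group
        self.splits = splits
        self.convs = nn.ModuleList(
            nn.Conv2d(c, c, k, padding=k // 2, groups=c, bias=False)
            for c, k in zip(splits, kernel_sizes)
        )

    def forward(self, x):
        chunks = torch.split(x, self.splits, dim=1)
        return torch.cat([conv(c) for conv, c in zip(self.convs, chunks)], dim=1)
```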

iamhankai commented 4 years ago
  2. We used mini_batch_size=1024 for training on 8 GPUs.
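(Presumably that corresponds to 1024 / 8 = 128 images per GPU per iteration.)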
AlexeyAB commented 4 years ago

@iamhankai Thanks.

So GhostNet looks much more promising.

AlexeyAB commented 4 years ago

As I understand, GhostBlock is just Conv2D + depthwise_conv2d + concat?

iamhankai commented 4 years ago

> As I understand, GhostBlock is just Conv2D + depthwise_conv2d + concat?

Yes. With these efficient operators, GhostNet can be simple yet fast.
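For readers following along, here is a minimal sketch of such a Ghost module: a regular Conv2D produces a few intrinsic feature maps, a cheap depthwise conv generates "ghost" features from them, and the two are concatenated. The ratio=2 split and the kernel sizes are my assumptions, not the repository's exact hyperparameters:

```python
import torch
import torch.nn as nn

class GhostModuleSketch(nn.Module):
    """Minimal sketch of a Ghost module as described above (not the
    repository's exact code): Conv2D + depthwise conv + concat."""
    def __init__(self, in_ch, out_ch, ratio=2, dw_kernel=3):
        super().__init__()
        init_ch = out_ch // ratio       # intrinsic feature maps
        cheap_ch = out_ch - init_ch     # "ghost" feature maps
        self.primary = nn.Sequential(
            nn.Conv2d(in_ch, init_ch, 1, bias=False),
            nn.BatchNorm2d(init_ch),
            nn.ReLU(inplace=True),
        )
        # Depthwise conv (groups == input channels), so it is cheap.
        self.cheap = nn.Sequential(
            nn.Conv2d(init_ch, cheap_ch, dw_kernel, padding=dw_kernel // 2,
                      groups=init_ch, bias=False),
            nn.BatchNorm2d(cheap_ch),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        primary = self.primary(x)
        ghost = self.cheap(primary)
        # Conv2D + depthwise_conv2d + concat, as discussed above.
        return torch.cat([primary, ghost], dim=1)
```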

AlexeyAB commented 4 years ago

@iamhankai Thanks for your answers and SOTA network!

  1. Why do you duplicate the convolution with stride = 2?

  2. Why do you specify the dropout probability several times but never use it?

  3. Why don't you perform ReLU after the shortcut (residual connection)? https://github.com/iamhankai/ghostnet/blob/47ef752446ba761dc5342ce06cbc26537b038289/ghost_net.py#L281

  4. Why do you calculate out_channel but never use it? https://github.com/iamhankai/ghostnet/blob/47ef752446ba761dc5342ce06cbc26537b038289/myconv2d.py#L29

  5. As I see it, the main decrease in BFLOPs (-60 MFLOPs) is achieved by moving the Conv2D (1280 filters) layer after the slim.avg_pool2d layer (https://github.com/iamhankai/ghostnet/blob/47ef752446ba761dc5342ce06cbc26537b038289/ghost_net.py#L218-L234), compared to placing it before pooling (a rough count follows below).
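For point 5, a back-of-the-envelope count shows why moving the 1x1 conv after global average pooling saves roughly 60 MFLOPs. The head dimensions below (960 input channels and a 7x7 final feature map, as in MobileNetV3-style heads) are my assumption, not stated in the thread:

```python
# MACs of a 1x1 conv = H * W * C_in * C_out (assumed head: 960 -> 1280)
before = 7 * 7 * 960 * 1280  # conv applied before pooling: ~60.2M MACs
after = 1 * 1 * 960 * 1280   # conv applied after global avg-pool: ~1.2M MACs
print(before - after)        # ~59M MACs saved, matching the ~60 MFLOPs above
```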
iamhankai commented 4 years ago
  1. Using stride=2 in both the shortcut branch and the main branch results in feature maps of the same size, so they can be added (see the sketch after this list).
  2. The dropout is set with dropout_keep_prob=0.8 in our GhostNet 1.0x.
  3. We follow MobileNetV2.
  4. The code isn't clean enough; sorry for that.
  5. We follow MobileNetV3.
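To make answer 1 concrete, here is a minimal sketch (my illustration, not the repository's code) of a downsampling block where both branches use stride=2 so the outputs can be summed; it also reflects answer 3, since no ReLU follows the addition:

```python
import torch
import torch.nn as nn

class StrideTwoBlockSketch(nn.Module):
    """Illustrative downsampling block: both the main branch and the
    shortcut branch use stride=2, so their outputs have matching
    spatial size and can be added. Channel counts are arbitrary."""
    def __init__(self, ch):
        super().__init__()
        self.main = nn.Sequential(
            nn.Conv2d(ch, ch, 3, stride=2, padding=1, bias=False),
            nn.BatchNorm2d(ch),
        )
        # The shortcut must downsample too, otherwise the shapes mismatch.
        self.shortcut = nn.Sequential(
            nn.Conv2d(ch, ch, 1, stride=2, bias=False),
            nn.BatchNorm2d(ch),
        )

    def forward(self, x):
        # Following MobileNetV2 (answer 3), no ReLU after the addition.
        return self.main(x) + self.shortcut(x)
```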