Closed suryabhupa closed 7 years ago
Thanks for your interest. We expect that adding the layer activations eliminates some information, while explicitly keeping all previous layers' activations provides more useful information to later layers. That's the major difference between DenseNets and ResNets.
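To make the distinction concrete, here is a toy sketch (not the paper's implementation; `transform` is a hypothetical stand-in for a conv + BN + ReLU layer) contrasting the ResNet-style addition with the DenseNet-style concatenation:

```python
import numpy as np

def transform(x, out_ch=4):
    # stand-in for conv + BN + ReLU; here just a fixed random linear map
    rng = np.random.default_rng(0)
    w = rng.standard_normal((x.shape[-1], out_ch))
    return x @ w

def resnet_block(x):
    # addition: output width is unchanged, earlier info is mixed in
    return x + transform(x, out_ch=x.shape[-1])

def densenet_block(x):
    # concatenation: output width grows, earlier activations kept intact
    return np.concatenate([x, transform(x)], axis=-1)

x = np.ones((1, 4))
print(resnet_block(x).shape)    # (1, 4) -- same width as the input
print(densenet_block(x).shape)  # (1, 8) -- width grows by the new features
```

The point is that after addition the original activations can no longer be recovered separately, whereas after concatenation every later layer still sees them unchanged.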
Ah okay, this is what I suspected as well; do you think the computation saved by adding the layer activations would outweigh the generality of concatenating them?
Our experiments suggest that well-designed transformations performed on the concatenated features (see the DenseNet-BC structure in our paper) can save parameters and computation compared with ResNets.
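A rough back-of-the-envelope sketch of why the bottleneck (the "B" in DenseNet-BC) saves parameters: the paper's bottleneck first applies a 1x1 conv producing 4k feature maps before the 3x3 conv, where k is the growth rate. The input width and growth rate below are illustrative numbers, not values from the paper:

```python
def params_plain(in_ch, k):
    # a single 3x3 conv mapping in_ch channels to k channels
    return in_ch * k * 3 * 3

def params_bc(in_ch, k):
    # 1x1 conv down to 4k channels, then a 3x3 conv to k channels
    return in_ch * (4 * k) * 1 * 1 + (4 * k) * k * 3 * 3

k = 12        # growth rate (assumed for illustration)
in_ch = 480   # deep in a dense block, the concatenated input is wide
print(params_plain(in_ch, k))  # 51840
print(params_bc(in_ch, k))     # 28224
```

Because the concatenated input keeps growing, the 1x1 bottleneck's savings become larger the deeper you are in a dense block.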
Maybe there's a balance between adding and concatenating that can maximize the computation savings.
Ah okay, I see. That's great to hear! Thanks for the explanations (the paper is awesome!) 👍
I may be misunderstanding the architecture, but why does DenseNet concatenate the current layer's feature maps to pass forward instead of using "true" residual connections?