liuzhuang13 / DenseNet

Densely Connected Convolutional Networks, In CVPR 2017 (Best Paper Award).
BSD 3-Clause "New" or "Revised" License

Deep-Narrow DenseNet #18

Closed by cgarciae 7 years ago

cgarciae commented 7 years ago

I was wondering if you ever tried the extreme case of growth_rate = 1 with a very deep network. Just as an exercise, I implemented a fully-connected dense block with growth_rate = 1 and depth = 50 on a 2D dataset so I could visualize what each neuron was learning; the results were very nice.
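For concreteness, here is a minimal sketch of that kind of fully-connected dense block, assuming tanh activations and random untrained weights (all names and choices here are illustrative, not taken from the repo): with growth_rate = 1, each layer is a single neuron that reads the 2D input concatenated with every previous layer's scalar output.

```python
import math
import random

random.seed(0)

def dense_block(x, depth):
    """Fully-connected dense block with growth_rate = 1: each layer
    adds one scalar feature, computed from the concatenation of the
    2D input and all previously produced features."""
    features = list(x)  # start with the raw 2D input
    for _ in range(depth):
        # one neuron per layer; random weights just for illustration
        w = [random.uniform(-1, 1) for _ in features]
        b = random.uniform(-1, 1)
        pre = sum(wi * fi for wi, fi in zip(w, features)) + b
        features.append(math.tanh(pre))
    return features

out = dense_block([0.5, -0.2], depth=50)
# the feature vector grows by exactly 1 per layer: 2 + 50 features
```

The point of the sketch is the connectivity pattern: every neuron's input width grows by one per layer, which is exactly why memory and time blow up at small growth rates in the full convolutional setting.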

liuzhuang13 commented 7 years ago

Thanks for your interest. Yes, at the beginning of the project we mostly set growth_rate = 1. It works well, but the memory and time consumption are too high compared with a reasonably large growth_rate, so in the end we used growth rates like 12 and 24.

Yes, visualization is more interesting with growth_rate = 1, because each layer produces only one feature map, and you would expect to see some trend across the feature maps. In fact, we tried this same experiment, but there was no obvious trend.

cgarciae commented 7 years ago

Thanks for the response! Nice to see that you tried this. Since my experiment was on a dataset of only 2 variables, using a dense block of fully connected layers, I could pass in the whole 2D mesh and see the complete set of activations for each neuron. As expected, the first layers were very simple and later ones became more complex, but some focused on really specific features, and some were very similar to previous layers with slight changes; I think this redundancy helps generalization. The geometry of the final layer's activation map over the 2D mesh looked more "ergonomic" than that of a sequential NN with a similar number of parameters.
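The mesh-visualization trick described above can be sketched as follows, again with hypothetical random weights and tanh units (nothing here comes from the actual experiment): evaluate every neuron of the dense block at every point of a 2D grid, so each neuron yields one scalar field over the plane that can be plotted.

```python
import math
import random

def layer_outputs(x, y, depth=5):
    """Per-neuron activations of a growth_rate = 1 dense block
    at the 2D input point (x, y)."""
    feats = [x, y]
    # fixed seed so every grid point sees the same weights,
    # while successive layers still get distinct weights
    rng = random.Random(0)
    for _ in range(depth):
        w = [rng.uniform(-1, 1) for _ in feats]
        b = rng.uniform(-1, 1)
        feats.append(math.tanh(sum(wi * f for wi, f in zip(w, feats)) + b))
    return feats[2:]  # drop the raw inputs, keep the neuron outputs

# evaluate every neuron over a coarse 21x21 mesh on [-1, 1]^2
mesh = [[layer_outputs(i / 10.0, j / 10.0) for i in range(-10, 11)]
        for j in range(-10, 11)]
# mesh[j][i][k] is neuron k's activation at grid point (i, j);
# plotting mesh[...][k] as an image shows what neuron k "sees"
```

With only 2 input variables this is exhaustive: the grid covers the whole input space, so each neuron's activation map is its complete learned function, which is what makes the early-simple/late-complex progression visible.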

BTW: in a more serious experiment, I used a DenseNet with fewer parameters than SqueezeNet and got better results. However, DenseNet is very resource-hungry; I think wider DenseNets should be researched to optimize resource use.

Thanks for your time!

liuzhuang13 commented 7 years ago

Hi, it's very interesting to see that your visualization makes sense!

Yes, we agree that wider DenseNets are more practically useful; that's why we recently updated our README and posted results for wider DenseNets. We hope people realize this and deploy more efficient models in their applications.