Tongcheng / DN_CaffeScript


Lack of two hyper-parameters for 'DenseNet-BC' support? #13

Open GuohongWu opened 6 years ago

GuohongWu commented 6 years ago

I have carefully read the authors' original paper. They describe DenseNet-BC as the combination of 'DenseNet-B' and 'DenseNet-C' (see Section 3 of the paper, 'Bottleneck layers' and 'Compression'). DenseNet-B needs a hyper-parameter that specifies how many feature-maps each 1×1 convolution reduces its input to. DenseNet-C needs another hyper-parameter (named 'theta', 0 < theta < 1) that specifies the ratio by which each transition layer (the layer connecting adjacent DenseBlocks) reduces the number of feature-maps. I read your Caffe source code and the .proto, but I could not find either hyper-parameter. How can I add these two hyper-parameters to implement the original DenseNet-BC?
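For reference, the channel bookkeeping that these two hyper-parameters control can be sketched in a few lines of Python. This is only an illustration of the arithmetic described in the paper, not code from this repository. The function names are hypothetical, and the default multiplier of 4 reflects the paper's choice of 4k feature-maps for the 1×1 bottleneck convolution.

```python
import math


def bottleneck_channels(growth_rate, multiplier=4):
    """DenseNet-B: channels produced by the 1x1 bottleneck convolution.

    The multiplier of 4 is the value used in the paper; it is exposed here
    as a parameter only for illustration.
    """
    return multiplier * growth_rate


def block_output_channels(in_channels, num_layers, growth_rate):
    """Channels at the end of a dense block: every layer adds growth_rate maps."""
    return in_channels + num_layers * growth_rate


def transition_channels(in_channels, theta):
    """DenseNet-C: a transition layer keeps floor(theta * m) of its m input maps."""
    if not 0 < theta <= 1:
        raise ValueError("theta must satisfy 0 < theta <= 1")
    return math.floor(theta * in_channels)


# Example values (assumed for illustration): growth rate 12, 16 layers per block,
# 24 channels entering the first block, theta = 0.5.
m = block_output_channels(in_channels=24, num_layers=16, growth_rate=12)
compressed = transition_channels(m, theta=0.5)
```

With theta = 1 the transition layer leaves the channel count unchanged, which corresponds to a network without compression.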

Tongcheng commented 6 years ago

Hello @GuohongWu, good question. For DenseNet-C, the compression is coded as a ConvolutionLayer in the .prototxt whose numOutput is smaller than its number of input channels. For DenseNet-B, we implicitly assume that the bottleneck channel count = 4 * growthRate. This assumption is built into the .cu code, and because the .cu code does some cudaMalloc itself, it can be a little harder to change. The general rule is that a variable named something_4G is related to 4 * growthRate, i.e. to DenseNet-B.

Thanks, Tongcheng Li
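As a rough illustration of the DenseNet-C part of this answer, the sketch below generates a standard Caffe Convolution layer for a transition, with num_output set to floor(theta * input channels). It is not taken from this repository: the helper name, the layer name, the bottom blob name, and the choice of a 1×1 kernel without bias are all assumptions made for the example.

```python
import math


def transition_conv_prototxt(name, bottom, in_channels, theta):
    """Return .prototxt text for a 1x1 transition convolution (hypothetical helper).

    num_output is floor(theta * in_channels), so theta < 1 compresses the
    feature-maps coming out of the preceding dense block.
    """
    if not 0 < theta <= 1:
        raise ValueError("theta must satisfy 0 < theta <= 1")
    num_output = math.floor(theta * in_channels)
    return (
        "layer {\n"
        f'  name: "{name}"\n'
        '  type: "Convolution"\n'
        f'  bottom: "{bottom}"\n'
        f'  top: "{name}"\n'
        "  convolution_param {\n"
        f"    num_output: {num_output}\n"
        "    kernel_size: 1\n"
        "    bias_term: false\n"
        "  }\n"
        "}\n"
    )


# Layer and blob names below are placeholders, not names used by this repository.
snippet = transition_conv_prototxt("Transition1_conv", "DenseBlock1", 216, 0.5)
print(snippet)
```

Changing the DenseNet-B bottleneck width is a different matter: as noted above, the 4 * growthRate assumption lives in the .cu code, so it cannot be adjusted from the .prototxt alone.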