Hi! Thank you for publishing this. I'm following the PyTorch implementation of DenseNet; specifically, I'm using densenet161 to extract features from images. In your implementation here, why do you add an extra transition layer, consisting of batch normalization and ReLU, after the last dense block? I don't see any mention of those operations in the paper. Am I missing something? I ask because I'd like to understand how these operations affect the quality of the learned features when DenseNet is used as a feature extractor rather than as an image classifier.
Thanks!