ShichenLiu / CondenseNet

CondenseNet: Lightweight CNN for mobile devices

No shuffle layer when training condensenet? #7

Closed (hiyijian closed this 6 years ago)

hiyijian commented 6 years ago

Dear @ShichenLiu, I did not find any shuffle-layer-related code in models.condensenet, which uses layers.LearnedGroupConv as the LGC. However, the paper clearly says we should use one. Is this a mismatch? Thanks

ShichenLiu commented 6 years ago

Hi @hiyijian, actually we include it implicitly in LearnedGroupConv for training-speed reasons, so it is not a mismatch. Specifically, here we drop weights from the shuffled convolution weights, which is equivalent to appending a shuffle layer. The shuffle layer is appended explicitly to CondenseConv here when the model is converted.
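
To see why folding the shuffle into the weights is equivalent, here is a minimal sketch (not the repo's code; the dense 1x1 conv and tensor shapes are illustrative assumptions): permuting the input channels of the features before a 1x1 convolution gives the same output as permuting the input-channel axis of the weight tensor itself.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Sketch: dropping weights from *shuffled* convolution weights is
# equivalent to an appended shuffle layer, because shuffling the input
# channels of x can be folded into the weight's input-channel indexing.
torch.manual_seed(0)

in_ch, groups = 8, 4
x = torch.randn(1, in_ch, 5, 5)
conv = nn.Conv2d(in_ch, in_ch, kernel_size=1, bias=False)

# Standard channel-shuffle permutation: view as (groups, ch_per_group),
# transpose, and flatten.
perm = torch.arange(in_ch).view(groups, in_ch // groups).t().reshape(-1)
inv_perm = torch.argsort(perm)

# Path A: explicit shuffle layer on the features, then the conv.
y_shuffle_then_conv = conv(x[:, perm])

# Path B: no shuffle layer; fold the (inverse) permutation into the
# weight's input-channel axis instead.
y_permuted_weight = F.conv2d(x, conv.weight[:, inv_perm])

print(torch.allclose(y_shuffle_then_conv, y_permuted_weight))  # True
```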

hiyijian commented 6 years ago

Thanks. So the shuffle layer is implicitly applied after the first stage. During the first stage, we don't need such a shuffle layer?

ShichenLiu commented 6 years ago

@hiyijian Yes. Only at the testing stage do we need explicit shuffle layers.
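
For reference, the explicit layer appended after the converted group conv at test time is typically the standard channel shuffle (reshape, transpose, flatten). A minimal sketch, assuming the usual formulation; the class name and shapes here are illustrative, not necessarily the repo's exact code:

```python
import torch
import torch.nn as nn

class ShuffleLayer(nn.Module):
    """Standard channel shuffle: split channels into groups, then
    interleave them so each group's outputs feed every group of the
    next grouped convolution."""
    def __init__(self, groups):
        super().__init__()
        self.groups = groups

    def forward(self, x):
        n, c, h, w = x.size()
        g = self.groups
        # (n, c, h, w) -> (n, g, c//g, h, w) -> transpose group and
        # channel dims -> flatten back to (n, c, h, w)
        x = x.view(n, g, c // g, h, w).transpose(1, 2).contiguous()
        return x.view(n, c, h, w)

# Usage at inference: a grouped 1x1 conv followed by the explicit shuffle.
layer = nn.Sequential(
    nn.Conv2d(16, 16, kernel_size=1, groups=4, bias=False),
    ShuffleLayer(groups=4),
)
out = layer(torch.randn(1, 16, 8, 8))
```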

lizhenstat commented 5 years ago

@ShichenLiu, I am still confused about this question. Doesn't the feature map already have a learned mapping to the 1x1 conv filters here? Why is a shuffle layer applied here?