qinzheng93 / diagonalwise-refactorization-caffe

Diagonalwise Refactorization: An Efficient Training Method for Depthwise Convolutions (in Caffe)

Reimplementing MobileNet using diagonalwise-refactorization depthwise convolution #2

Open austingg opened 6 years ago

austingg commented 6 years ago

Hi, nice work. Have you reimplemented MobileNet v1/v2 based on this repository?

ghost commented 6 years ago

In fact, the original implementation of this work is based on darknet, and we have trained many models with it. The implementation in this repo is a port of that darknet implementation. I haven't trained a MobileNet model with this repo, but I did check the correctness of the depthwise convolutional layer, so I think it works.
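For readers who want to reproduce that kind of correctness check, here is a minimal NumPy sketch. It is not the check used by the author. It assumes the simplest form of the idea: the C depthwise filters are placed on the channel diagonal of a standard (C, C, k, k) filter bank with zeros elsewhere, and the result is compared against a naive depthwise convolution. All function names (`depthwise_conv`, `standard_conv`, `to_diagonal`) are hypothetical and do not come from this repo.

```python
import numpy as np

def depthwise_conv(x, w):
    """Naive depthwise convolution (stride 1, no padding).
    x: (C, H, W), w: (C, k, k) -> (C, H-k+1, W-k+1)."""
    C, H, W = x.shape
    k = w.shape[-1]
    out = np.zeros((C, H - k + 1, W - k + 1))
    for c in range(C):
        for i in range(H - k + 1):
            for j in range(W - k + 1):
                # Each channel is filtered only by its own k x k kernel.
                out[c, i, j] = np.sum(x[c, i:i + k, j:j + k] * w[c])
    return out

def standard_conv(x, w):
    """Naive standard convolution (stride 1, no padding).
    x: (C_in, H, W), w: (C_out, C_in, k, k) -> (C_out, H-k+1, W-k+1)."""
    C_out, _, k, _ = w.shape
    _, H, W = x.shape
    out = np.zeros((C_out, H - k + 1, W - k + 1))
    for o in range(C_out):
        for i in range(H - k + 1):
            for j in range(W - k + 1):
                # Each output channel sums over all input channels.
                out[o, i, j] = np.sum(x[:, i:i + k, j:j + k] * w[o])
    return out

def to_diagonal(w):
    """Assumed layout: put filter c at position [c, c] of a dense
    (C, C, k, k) weight tensor and leave everything else zero."""
    C, k, _ = w.shape
    full = np.zeros((C, C, k, k), dtype=w.dtype)
    idx = np.arange(C)
    full[idx, idx] = w
    return full

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 6, 6))
w = rng.standard_normal((4, 3, 3))
same = np.allclose(depthwise_conv(x, w), standard_conv(x, to_diagonal(w)))
print("outputs match:", same)
```

Because the off-diagonal weights are zero, every output channel of the standard convolution only sees its own input channel, which is why the two outputs should agree up to floating-point error.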

Besides, saving models with this repo is a little tricky because I haven't found a proper way to convert models between the diagonalwise refactorization method and other methods. I don't have time for this at the moment, so PRs are welcome.

austingg commented 6 years ago

Yes, I have found that too: the weight shape of the cuDNN depthwise layer does not match that of the other methods. When a checkpoint is used for initialization (for example, to resume training or to test accuracy), a shape-mismatch error occurs.

ghost commented 6 years ago

Yes, but I think this is just an engineering problem and is not hard to solve. One simple way is to treat the diagonalwise refactorization depthwise convolution and the other methods as different layer types; a minor modification to the code can implement this.
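An alternative to separate layer types is to convert the weight blobs when moving a checkpoint between methods. The sketch below is only an illustration of that idea under stated assumptions. Caffe stores grouped-convolution weights as (num_output, channels / group, kh, kw), so a regular depthwise layer with C channels has shape (C, 1, kh, kw). The refactorized blob is assumed here to have shape (C, C / groups, kh, kw), with filter c stored at in-group position c mod (C / groups) and zeros elsewhere; the layout actually used by this repo has not been verified and may differ. The function names and the `groups` parameter are hypothetical.

```python
import numpy as np

def depthwise_to_diagonalwise(w_dw, groups=1):
    """(C, 1, kh, kw) depthwise weights -> (C, C // groups, kh, kw)
    weights with the filters on the per-group channel diagonal (assumed layout)."""
    C, one, kh, kw = w_dw.shape
    assert one == 1 and C % groups == 0
    per_group = C // groups
    w_diag = np.zeros((C, per_group, kh, kw), dtype=w_dw.dtype)
    out_idx = np.arange(C)
    # Output channel c reads input channel c, i.e. in-group index c % per_group.
    w_diag[out_idx, out_idx % per_group] = w_dw[:, 0]
    return w_diag

def diagonalwise_to_depthwise(w_diag):
    """Inverse conversion: keep only the diagonal filters.
    Off-diagonal entries are discarded, which assumes they are zero."""
    C, per_group = w_diag.shape[:2]
    out_idx = np.arange(C)
    return w_diag[out_idx, out_idx % per_group][:, None]

# Round trip on random weights: 8 channels, 3x3 kernels, 2 groups.
w = np.random.default_rng(0).standard_normal((8, 1, 3, 3))
d = depthwise_to_diagonalwise(w, groups=2)
print(d.shape)                                            # (8, 4, 3, 3)
print(np.array_equal(diagonalwise_to_depthwise(d), w))    # True
```

With `groups=1` the refactorized blob is a dense (C, C, kh, kw) tensor, and with `groups=C` it coincides with the ordinary depthwise shape, so the same pair of helpers would cover both ends of the range.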