hpi-xnor / BMXNet

(New version is out: https://github.com/hpi-xnor/BMXNet-v2) BMXNet: An Open-Source Binary Neural Network Implementation Based on MXNet

Order of QActivation, QConvolution layers #49

Closed: simonmaurer closed this issue 5 years ago

simonmaurer commented 6 years ago

@yanghaojin

  1. Could you quickly elaborate on why the QActivation (as referenced in the code snippet of the paper) is placed in front of the QConvolution/QFullyConnected layers? For example, why is there another activation layer after the binarized one (see the sketch after this list): `ba2 = mx.symbol.QActivation(...)`, `fc1 = mx.symbol.QFullyConnected(...)`, `bn3 = mx.sym.BatchNorm(...)`, `tanh3 = mx.sym.Activation(...)`
  2. Could one use mx.symbol.LeakyReLU, or would you suggest implementing activation functions like PReLU/Swish (as supported by the Gluon API) for binary networks in the underlying C/C++ source code?
  3. For a project we are especially interested in running inference in C++. Are both c_predict_api.h and mxnet-cpp/MxNetCpp.h, as used in https://github.com/apache/incubator-mxnet/blob/master/cpp-package/example/feature_extract/feature_extract.cpp, compatible with BMXNet?
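
For reference, a minimal sketch of the ordering asked about in point 1, assuming the BMXNet QActivation/QFullyConnected symbols follow the standard FullyConnected signature and using hypothetical layer sizes; the binary activation binarizes the input that the binary layer consumes, while the trailing BatchNorm/Activation pair operates on the full-precision output:

```python
import mxnet as mx

# Ordering as in the paper's snippet (hypothetical layer sizes):
# the binary activation precedes the binary fully connected layer,
# and a standard BatchNorm + full-precision activation follow it.
data  = mx.symbol.Variable('data')
fc0   = mx.symbol.FullyConnected(data=data, num_hidden=512)    # first layer kept full precision
bn1   = mx.symbol.BatchNorm(data=fc0)
ba2   = mx.symbol.QActivation(data=bn1)                        # binarizes the activations
fc1   = mx.symbol.QFullyConnected(data=ba2, num_hidden=512)    # binary-weight layer
bn3   = mx.symbol.BatchNorm(data=fc1)
tanh3 = mx.symbol.Activation(data=bn3, act_type='tanh')        # the extra activation in question
fc2   = mx.symbol.FullyConnected(data=tanh3, num_hidden=10)    # last layer kept full precision
mlp   = mx.symbol.SoftmaxOutput(data=fc2, name='softmax')
```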
yanghaojin commented 6 years ago
  1. You can also remove the tanh and ReLU activations and just apply the binary activation. We found that adding a ReLU activation after each residual block in the ResNet architecture slightly improves accuracy (see the sketch after this list).
  2. If you want to deploy a binary model on low-power devices that only support C/C++, you probably have to do this.
  3. c_predict_api.h should work if you adapt the corresponding Makefile/CMake file to include the BMXNet-specific sources. As for feature_extract.cpp, I didn't check it, but if the standard MXNet convolution layers work with it, there is no reason why the QConvolution layer would not.
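
A minimal sketch of what point 1 suggests, reusing the QActivation/QFullyConnected symbols from the snippet above with hypothetical layer sizes: only the binary activation is kept inside each block, and a plain ReLU is optionally placed after a residual block in a ResNet-style network:

```python
import mxnet as mx

# Simplified block per answer 1: drop the extra tanh/ReLU and keep only the
# binary activation (hypothetical layer sizes and signatures).
def binary_fc_block(data, num_hidden):
    bn = mx.symbol.BatchNorm(data=data)
    qa = mx.symbol.QActivation(data=bn)                        # binary activation only
    return mx.symbol.QFullyConnected(data=qa, num_hidden=num_hidden)

data   = mx.symbol.Variable('data')
block1 = binary_fc_block(data, 512)
block2 = binary_fc_block(block1, 512)
out    = mx.symbol.FullyConnected(data=block2, num_hidden=10)  # last layer kept full precision
net    = mx.symbol.SoftmaxOutput(data=out, name='softmax')

# In a ResNet-style model, a standard ReLU after each residual block
# (i.e. after the shortcut addition) was found to slightly improve accuracy:
#   summed = conv_branch + shortcut
#   relu   = mx.symbol.Activation(data=summed, act_type='relu')
```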