Fix a bug in `fuse_conv_batchnorm_kernel` and `replace_node`.
The current `fuse_conv_batchnorm` only fuses `gamma/sqrt(var + eps)` into the weight and ignores the bias term `beta - gamma*mean/sqrt(var + eps)`. The complete implementation should be:
1. Fuse the coefficient `gamma/sqrt(var + eps)` into the weight.
2. Run the convolution with the new weight.
3. Compute the bias as `beta - gamma*mean/sqrt(var + eps)`.
4. Broadcast-add the bias to the convolution output.
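The four steps above can be checked numerically. This is a minimal NumPy sketch (not the actual CUDA code; shapes and the naive convolution are assumptions for illustration) showing that the fused weight plus the broadcast-added bias reproduces conv followed by batchnorm inference:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical shapes: 1 input channel, 2 output channels, 3x3 kernel.
C_out, C_in, K = 2, 1, 3
x = rng.standard_normal((C_in, 8, 8))
W = rng.standard_normal((C_out, C_in, K, K))

# BatchNorm inference parameters, one per output channel.
gamma = rng.standard_normal(C_out)
beta = rng.standard_normal(C_out)
mean = rng.standard_normal(C_out)
var = rng.random(C_out) + 0.5
eps = 1e-5

def conv2d(x, W):
    # Naive valid cross-correlation, for illustration only.
    _, H, Wd = x.shape
    C_out, _, K, _ = W.shape
    out = np.zeros((C_out, H - K + 1, Wd - K + 1))
    for o in range(C_out):
        for i in range(H - K + 1):
            for j in range(Wd - K + 1):
                out[o, i, j] = np.sum(W[o] * x[:, i:i+K, j:j+K])
    return out

# Reference: conv followed by batchnorm inference.
scale = gamma / np.sqrt(var + eps)
y_ref = conv2d(x, W)
y_ref = scale[:, None, None] * (y_ref - mean[:, None, None]) + beta[:, None, None]

# Fused: step 1 folds the scale into the weight, step 3 computes the bias,
# step 4 broadcast-adds it to the conv output (step 2).
W_fused = W * scale[:, None, None, None]
bias = beta - gamma * mean / np.sqrt(var + eps)
y_fused = conv2d(x, W_fused) + bias[:, None, None]

assert np.allclose(y_ref, y_fused)
```

The identity holds because the convolution is linear in the weight, so the per-channel scale can be folded into `W`, leaving only the constant per-channel bias to add afterwards.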
The `alpha` and `beta` parameters passed to `cudnnBatchNormalizationForwardInference` in `batchnorm_kernel.cu` also look wrong.
According to the cuDNN doc, `alpha` should be 0 and `beta` should be 1; with that setting `cudnnBatchNormalizationForwardInference` computes `bnScale * (x - estimatedMean)/sqrt(epsilon + estimatedVariance) + bnBias`. But the current setting is `alpha=1, beta=0`, which amounts to an identity.
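For reference, the per-channel inference formula quoted above can be sketched directly (a NumPy illustration; parameter names follow the cuDNN doc, and the sample values are made up):

```python
import numpy as np

# Per-channel BN inference, following the formula from the cuDNN doc:
# y = bnScale * (x - estimatedMean) / sqrt(epsilon + estimatedVariance) + bnBias
def bn_inference(x, bnScale, bnBias, estimatedMean, estimatedVariance, epsilon=1e-5):
    return bnScale * (x - estimatedMean) / np.sqrt(epsilon + estimatedVariance) + bnBias

x = np.array([1.0, 2.0, 3.0])
y = bn_inference(x, bnScale=2.0, bnBias=1.0, estimatedMean=2.0, estimatedVariance=4.0)
assert np.allclose(y, [0.0, 1.0, 2.0], atol=1e-4)
```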