use fused batch norm and make gamma optional

buriburisuri / sugartensor

A slim TensorFlow wrapper that provides syntactic sugar for tensor variables. This library is intended for practical deep learning researchers, not beginners.

MIT License


use fused batch norm and make gamma optional #25

Closed AndreasMadsen closed 7 years ago

AndreasMadsen commented 7 years ago

Fused batch norm is recommended in the tensorflow/performance_guide. I couldn't use tf.contrib.layers.batch_norm because it creates its own variables and I couldn't get reuse=True to work.

This PR also makes gamma optional, since the scale is redundant when the activation function is linear under multiplication by a positive factor (e.g. ReLU in ByteNet): the next layer can absorb the scaling.
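To illustrate why gamma can be dropped, here is a minimal sketch in plain Python (not the sugartensor or TensorFlow API). The names `batch_norm`, `relu`, `with_gamma` and `without_gamma` are hypothetical, the example treats a single channel with scalar parameters, and the equivalence shown assumes gamma is positive.

```python
import math

def batch_norm(xs, beta=0.0, gamma=None, eps=1e-3):
    """Normalize one channel over the batch; gamma=None skips the scale."""
    mean = sum(xs) / len(xs)
    var = sum((x - mean) ** 2 for x in xs) / len(xs)
    scale = 1.0 if gamma is None else gamma
    return [scale * (x - mean) / math.sqrt(var + eps) + beta for x in xs]

def relu(xs):
    return [max(0.0, x) for x in xs]

xs = [0.5, -1.2, 3.0, 0.1]
gamma, beta, w = 2.0, 0.4, -0.7  # illustrative values; gamma must be > 0

# Batch norm with gamma, then ReLU, then a linear weight w.
with_gamma = [w * h for h in relu(batch_norm(xs, beta=beta, gamma=gamma))]

# Same function without gamma: fold gamma into the offset and the next weight.
without_gamma = [(gamma * w) * h for h in relu(batch_norm(xs, beta=beta / gamma))]

assert all(
    math.isclose(a, b, rel_tol=1e-9, abs_tol=1e-12)
    for a, b in zip(with_gamma, without_gamma)
)
```

Because ReLU satisfies relu(g * x) = g * relu(x) for g > 0, a network without gamma can represent the same function by rescaling beta and the following layer's weights, so learning gamma adds parameters without adding expressiveness in this setting.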

I see approximately 20% better performance in ByteNet with these optimizations.

#19 can be closed when this is merged.

buriburisuri commented 7 years ago

@AndreasMadsen Wow~ this is a great contribution! Thank you very much.