Training with noisy activations resulted in much better conversion performance, especially for deep neural networks. However, training with too much noise also degrades the pre-training performance and may prevent the network from converging to a good pre-training accuracy.
As with the weight-quantization and activation-quantization adaptations, the noise rate may also be increased gradually. This can retain the pre-training accuracy while adapting the model for high noise resistance.
An additional noise-adaptation step after pre-training would be a good solution.
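The gradual noise-rate adaptation described above can be sketched as a simple schedule that ramps the activation-noise rate from zero up to a target value over the first training epochs. This is an illustrative implementation, not the paper's exact procedure: the function names, the linear ramp, and the use of zero-mean Gaussian noise are all assumptions made for the sketch.

```python
import numpy as np

def noise_rate_schedule(epoch, ramp_epochs, target_rate):
    """Linearly ramp the activation-noise rate from 0 to target_rate
    over `ramp_epochs` epochs, then hold it constant (an assumed schedule)."""
    return target_rate * min(1.0, epoch / ramp_epochs)

def noisy_activations(a, rate, rng):
    """Perturb activations with zero-mean Gaussian noise whose standard
    deviation is the current noise rate (an assumed noise model)."""
    return a + rng.normal(0.0, rate, size=a.shape)

rng = np.random.default_rng(0)
a = np.ones(4)  # toy activation vector

# Early in training: almost no noise, so pre-training accuracy is preserved.
early_rate = noise_rate_schedule(epoch=1, ramp_epochs=10, target_rate=0.2)

# After the ramp: the full target noise rate, for high noise resistance.
late_rate = noise_rate_schedule(epoch=20, ramp_epochs=10, target_rate=0.2)

noisy_a = noisy_activations(a, late_rate, rng)
```

Applying the same idea, a separate noise-adaptation phase after pre-training would simply run additional epochs with the schedule ramping from zero while the rest of the training setup stays fixed.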