Closed dyigitpolat closed 1 year ago
Training with non-clamped activations has proven more effective, so ClampedReLU should be introduced in a separate adaptation step rather than during training.
This also gives us the freedom to pre-train models with various activation functions.
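A minimal sketch of the idea: pre-train with an ordinary unbounded ReLU, then swap in a ClampedReLU only during an adaptation pass. The `ceiling` parameter, the toy model layout, and the `adapt` helper below are all hypothetical illustrations, not the actual implementation.

```python
def relu(x):
    """Standard unbounded ReLU, used during pre-training."""
    return max(0.0, x)

def make_clamped_relu(ceiling=1.0):
    """Build a ClampedReLU with a given ceiling (hypothetical parameter)."""
    def clamped_relu(x):
        return min(max(0.0, x), ceiling)
    return clamped_relu

# Toy "model": each layer is a (weight, activation) pair.
model = [(2.0, relu), (0.5, relu)]

def adapt(model, ceiling=1.0):
    """Adaptation step: replace every activation with ClampedReLU,
    leaving the trained weights untouched."""
    act = make_clamped_relu(ceiling)
    return [(w, act) for w, _ in model]

def forward(model, x):
    for w, act in model:
        x = act(w * x)
    return x

adapted = adapt(model)
print(forward(model, 3.0))    # unclamped activations: 3.0
print(forward(adapted, 3.0))  # clamped at 1.0 per layer: 0.5
```

Because the clamp is applied only in `adapt`, the pre-training loop never depends on it, so any activation function could stand in for `relu` before adaptation.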
done