Closed shifaspv closed 6 years ago
I fixed the issue by setting the learning rate to a very low value. Reducing the batch size increases the stochasticity of the gradient estimate, so it is better to use a lower learning rate with a smaller batch size.
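The idea of tying the learning rate to the batch size is often expressed as a linear scaling heuristic. A minimal sketch, assuming illustrative values for `base_lr` and `base_batch_size` (these are not from the repo's config):

```python
# Hedged sketch: a common heuristic is to scale the learning rate
# in proportion to the batch size, so a smaller batch gets a smaller step.
# base_lr and base_batch_size are illustrative values, not from this repo.

def scaled_lr(batch_size, base_lr=0.001, base_batch_size=10):
    """Linear scaling rule: learning rate proportional to batch size."""
    return base_lr * batch_size / base_batch_size

print(scaled_lr(5))  # half the base rate for half the batch size
```

This is only a rule of thumb; in practice the rate that avoids divergence (here, 0.0001 for batch size 5) is found empirically.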
Hi Shifaspv, yes the batch size does have an influence during training since the gradient that gets backpropagated is the average of the gradients of that batch. So a larger batch will, in general, result in a smoother gradient, and vice versa. However, during inference the batch size does not have an effect on the output of the network.
@shifaspv what learning rate did you use for training with batch size of 5?
Hi all, @drethage, thanks for the reply. When I trained with a batch size of 5, training stopped after 127 epochs because of the early stopping criterion. The model is not as good as the one you have displayed. Does the batch size really matter for the final model?
@rafaelvalle, I trained with a learning rate of 0.0001 for batch size 5.
@shifaspv thanks!
Hi, initially I could not train the model with a batch size of 10 due to memory issues. I then trained with a batch size of 5 (by editing the config.json file) without touching any other parameters. Training stopped after 41 epochs due to the early stopping condition. When I used the 41st checkpoint as the model parameters for testing, the returned denoised wave file was empty (complete silence).
What could the problem be? Does early stopping cause the null model? Does the batch size really matter when testing a model?