Closed timqqt closed 5 years ago
Hi, a batch size of 1 is used only to report the memory requirements of all methods. For training, the larger the batch size the better, since a larger batch reduces the variance of the gradient estimates.
In the example scripts you can see that the default batch size is 32 for the speed limits experiment and 128 for the MNIST experiment.
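To illustrate the point about gradient variance, here is a minimal sketch on a toy one-parameter regression problem. It is not taken from the example scripts: the data, the model, and the function names `minibatch_gradient` and `gradient_variance` are all assumptions made up for this illustration. It estimates how much the mini-batch gradient fluctuates for batch sizes 1, 32 and 128.

```python
import random
import statistics


def minibatch_gradient(w, data, batch_size, rng):
    """Gradient of the mean of 0.5 * (w * x - y)**2 w.r.t. w on a random mini-batch."""
    batch = rng.sample(data, batch_size)  # sample without replacement
    return sum((w * x - y) * x for x, y in batch) / batch_size


def gradient_variance(batch_size, trials=2000, seed=0):
    """Empirical variance of the mini-batch gradient at w = 0 (hypothetical toy setup)."""
    rng = random.Random(seed)
    # Toy dataset: y = 2x + noise. Purely illustrative, not the paper's data.
    data = []
    for _ in range(1024):
        x = rng.gauss(0.0, 1.0)
        data.append((x, 2.0 * x + rng.gauss(0.0, 0.5)))
    grads = [minibatch_gradient(0.0, data, batch_size, rng) for _ in range(trials)]
    return statistics.pvariance(grads)


if __name__ == "__main__":
    for batch_size in (1, 32, 128):
        print(batch_size, gradient_variance(batch_size))
```

On this toy problem the variance should shrink roughly in proportion to 1 / batch size, which is one plausible reason why training with a batch size of 1 can behave much worse than training with 32 or 128.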
Let me know if I can help with anything more.
Cheers, Angelos
I am closing the issue, but feel free to reopen it (or open another one) if needed.
Angelos
Hey, I am trying to reproduce your work. Your paper seems to imply that the batch size for all experiments should be 1. However, when I set the batch size to 1, I cannot get the same error as in your experiments (mine differs by about a factor of 10). When I set the batch size to 32, I get decent results.
I would really appreciate your help in explaining the details of your experiments!