Hi, I'm wondering how long it took you to train PyramidNet+ShakeDrop on CIFAR-100.
Also, what were your experimental settings? How many nodes and GPUs did you use? Did you use Horovod for distributed training? And is the batch size of 64 the total batch size, or the batch size per GPU?
Thank you!