Closed chjw1475 closed 5 years ago
Hi, all the hyperparameters are set in the args class for CIFAR10, could you please tell what was your test accuracy at the end?
When I tried batch_size = 32, the test accuracy was around 0.1 at epoch 36. I think it was because I did not adjust the learning rate according to the batch size.
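For reference, the usual heuristic here is linear learning-rate scaling: scale the base learning rate by the ratio of your batch size to the reference batch size. A minimal sketch, assuming a hypothetical base LR of 0.001 (the repo's actual base LR may differ) and the repo's default batch size of 256:

```python
def scale_lr(base_lr: float, base_batch: int, new_batch: int) -> float:
    """Scale the learning rate linearly with batch size.

    base_lr and base_batch here are illustrative assumptions,
    not the repo's confirmed settings.
    """
    return base_lr * new_batch / base_batch

# Dropping from batch 256 to batch 32 shrinks the LR by 8x:
print(scale_lr(0.001, 256, 32))   # 0.000125
```

With batch_size = 32 and an unscaled LR, the effective step per example is much larger, which can explain training getting stuck near 10% accuracy.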
The batch_size in the args class is 256, but when I try to train on a single 1080 Ti GPU, I run out of memory. Does that mean the CIFAR-10 model cannot be trained on a single GPU? Instead, I tried batch_size = 128, and the test accuracy was around 0.897 at epoch 34.
You don't necessarily have to keep the batch size at 256; 128 is fine. If you train for 100 epochs and then continue training with the hard loss, you can converge to 91%.
We used 256 with four V100 GPUs in parallel to speed up training, but you can still train the model on a single GPU.
Yes. I got 90.6% test accuracy. Thanks a lot!
Great. Thank you very much.
Hi. Thank you for uploading your code. Could you let me know the hyperparameters for CIFAR-10?
I tried
model, eval_model = DeepCapsNet(input_shape=x_train.shape[1:], n_class=y_train.shape[1], routings=args.routings)  # for 64*64 inputs, batch_size = 32
to train on a single 1080 Ti GPU, but it does not seem to converge. Thank you.