You mention that training takes four hours on three NVIDIA GTX 1080 Ti GPUs. However, README.md does not describe how to train the network on three GPUs.
When I run
python train.py --id resnet50_rnn --use_rnn
it uses only a single GPU with a batch size of eight, which differs from the batch size stated in your paper.
Could you please describe the training process in detail, including how to run it on multiple GPUs?