ShaneYS closed this issue 6 years ago
Not sure why, but how about trying resnet50 as the base network?
@ijkguo Thank you. Maybe someone else was using the GPU. I am using resnet101 and have now trained for 100000 batches without any problem. Another question: can I use rcnn-batch-size>1 to train? When I try rcnn-batch-size>1, I get an out of memory error.
It should work. Note: batch size 2 means roughly 2x the memory consumption.
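To make the scaling concrete, here is a rough sketch of that arithmetic. It is not code from mx-rcnn: the function names and the `fixed_mb` / `per_image_mb` figures are hypothetical placeholders, and the linear model (a fixed cost for weights and optimizer state plus a per-image activation cost) is only an approximation. The 12 GB figure is the memory of a single TITAN Xp.

```python
def estimate_gpu_memory_mb(batch_size, fixed_mb, per_image_mb):
    """Rough linear estimate: fixed cost plus a per-image activation cost."""
    if batch_size < 1:
        raise ValueError("batch_size must be >= 1")
    return fixed_mb + batch_size * per_image_mb


def max_batch_size(gpu_mb, fixed_mb, per_image_mb):
    """Largest per-GPU batch size that fits under the linear estimate."""
    return max(0, int((gpu_mb - fixed_mb) // per_image_mb))


# Hypothetical numbers for illustration only; measure your own with nvidia-smi.
GPU_MB = 12 * 1024       # one TITAN Xp has 12 GB
FIXED_MB = 2000          # assumed: weights, gradients, optimizer state
PER_IMAGE_MB = 6000      # assumed: activations for one image

for bs in (1, 2):
    need = estimate_gpu_memory_mb(bs, FIXED_MB, PER_IMAGE_MB)
    print(bs, need, "fits" if need <= GPU_MB else "out of memory")
```

With these placeholder figures only batch size 1 fits on one card, which would match the behaviour described here, but the real per-image cost depends on the image size and the network.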
Thanks to your great work, I can now start training mx-rcnn on the OpenImages dataset. But there is still a problem. When I finetune resnet101 with rcnn-batch-size>1, I get an error: cudaMalloc failed: out of memory. With rcnn-batch-size=1, training runs smoothly at first, but the same out of memory error still occurs after thousands of batches. I think I did not modify the batch size correctly. Can you tell me how to solve this problem? Thank you very much. My GPUs are 4 x TITAN Xp.