matterport / Mask_RCNN

Mask R-CNN for object detection and instance segmentation on Keras and TensorFlow

Different results with GPU Nvidia P4000 (8GB) and GPU Nvidia P100 (16GB) #2749

Open felipetobars opened 2 years ago

felipetobars commented 2 years ago

I trained a model starting from pre-trained COCO weights, with approximately 960 training images and 320 validation images, applying data augmentation with imgaug and training in stages ('heads', '4+', 'all'). I ran the training both on a PC with a P4000 GPU and in Google Colaboratory with a P100 GPU. The results showed overfitting in the Google Colab environment. The results are not much better with the P4000, but I get better training and validation loss graphs there. My question is: why are the results not better on a GPU with more memory? I have tried changing the images per GPU and the steps per epoch, and decreasing the learning rate, but I still get similar results: with the P100, the validation loss begins to increase in the last training epochs. Are there any parameters that Mask R-CNN adjusts automatically according to computational capacity?

Google Colab (Tesla P100): [image] [image]

Nvidia P4000: [image] [image]

I appreciate your responses.

monjurulkarim commented 2 years ago

@felipetobars I have also faced a similar problem that I haven't been able to solve yet. Hopefully someone in this forum can help us out.
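
On the question of automatically adjusted parameters: as far as I know, the repository's `Config` class derives `BATCH_SIZE` as `IMAGES_PER_GPU * GPU_COUNT`, so a larger images-per-GPU value on the bigger card changes the effective batch size and the number of images seen per epoch, even if the learning rate and steps per epoch stay the same. Below is a minimal sketch for comparing the settings of two runs side by side. It does not import Mask R-CNN. `RunSettings` and `diff_settings` are hypothetical helpers, and the numbers are illustrative placeholders, not the values used in this issue.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunSettings:
    """Hypothetical container for the settings mentioned in the question."""
    name: str
    gpu_count: int
    images_per_gpu: int
    steps_per_epoch: int
    learning_rate: float

    @property
    def batch_size(self) -> int:
        # Assumed to mirror how the effective batch size is derived:
        # images per GPU multiplied by the number of GPUs.
        return self.gpu_count * self.images_per_gpu

    @property
    def images_per_epoch(self) -> int:
        # Each step consumes one batch, so this is the number of
        # (possibly repeated) images processed in one epoch.
        return self.batch_size * self.steps_per_epoch


def diff_settings(a: RunSettings, b: RunSettings) -> dict:
    """Return {setting: (a_value, b_value)} for every setting that differs."""
    keys = [
        "gpu_count",
        "images_per_gpu",
        "steps_per_epoch",
        "learning_rate",
        "batch_size",
        "images_per_epoch",
    ]
    return {
        k: (getattr(a, k), getattr(b, k))
        for k in keys
        if getattr(a, k) != getattr(b, k)
    }


# Illustrative placeholder values only (NOT taken from the issue).
p4000 = RunSettings("P4000", gpu_count=1, images_per_gpu=1,
                    steps_per_epoch=960, learning_rate=0.001)
p100 = RunSettings("P100", gpu_count=1, images_per_gpu=2,
                   steps_per_epoch=960, learning_rate=0.001)

for setting, (left, right) in diff_settings(p4000, p100).items():
    print(f"{setting}: {p4000.name}={left}  {p100.name}={right}")
```

If a comparison like this shows that only the batch size differs, that is one candidate explanation to test, for example by setting the same images per GPU on both machines and comparing the loss curves again. Differences in library versions or random seeds between the two environments would not show up in this check.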