Open swapniel99 opened 4 years ago
I'm having the same issue: an OOM error during transfer learning from Faster-RCNN-resnet101-coco, using TF 1.15 and Python 3.6 on a Colab GPU. I rolled back to commit 73b5be67f8b9b70b46c5cfb7b6b69b0106b1b94c and it worked.
Prerequisites
Please answer the following questions for yourself before submitting an issue.
1. The entire URL of the file you are using
https://github.com/tensorflow/models/blob/master/research/object_detection/model_main.py
https://github.com/tensorflow/models/blob/master/research/object_detection/samples/configs/faster_rcnn_resnet101_coco.config
2. Describe the bug
I tried transfer learning on Faster-RCNN-resnet101-coco. Execution reaches step 0 and then crashes with an out-of-memory error. This started after the commit below: 451906e4e82f19712455066c1b27e2a6ba71b1dd. None of the commits before it gave this error. It looks like an issue with TF-Slim.
3. Steps to reproduce
Check out the latest master branch. Attempt transfer learning on Faster-RCNN-Resnet101-COCO.
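For anyone trying to reproduce this, here is a minimal sketch that prints the commands for one attempt at each of the two commits named in this report. It assumes you run them from the `research` directory of a tensorflow/models checkout with the usual `PYTHONPATH` setup already done. The config and output paths are placeholders, the helper name `repro_commands` is hypothetical, and the small step count is only there to get past step 0 quickly.

```python
import shlex

# Commit hashes quoted in this report.
KNOWN_GOOD = "73b5be67f8b9b70b46c5cfb7b6b69b0106b1b94c"
SUSPECT = "451906e4e82f19712455066c1b27e2a6ba71b1dd"


def repro_commands(commit, pipeline_config, model_dir, num_train_steps=10):
    """Return argv lists for one reproduction attempt at the given commit.

    Nothing is executed here; the caller decides how to run the commands.
    """
    return [
        ["git", "checkout", commit],
        [
            "python", "object_detection/model_main.py",
            f"--pipeline_config_path={pipeline_config}",
            f"--model_dir={model_dir}",
            f"--num_train_steps={num_train_steps}",
            "--alsologtostderr",
        ],
    ]


if __name__ == "__main__":
    # Placeholder paths: substitute your own config and output directory.
    for commit in (KNOWN_GOOD, SUSPECT):
        for argv in repro_commands(
            commit,
            "path/to/faster_rcnn_resnet101_coco.config",
            "path/to/model_dir",
        ):
            print(" ".join(shlex.quote(arg) for arg in argv))
```

If the run at the first commit gets past step 0 and the run at the second does not, that would be consistent with what is described above.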
4. Expected behavior
Transfer learning steps run successfully.
5. Additional context
Out of memory error.
6. System information
Environment: tf_env.txt