Open wenhe-jia opened 6 years ago
Hi, @LeonJWH Have you tried to resume the training from this step?
@Zehaos I took your advice and tried to resume training from this step, and it has gone well so far.
Did this problem occur during your training process? What causes it?
@LeonJWH Hi, I encountered this problem during my training, how do you resume training from this step?
Hi, @Zehaos I've met the same error. I tried changing BATCH_ROIS from 128 to 64, but I still get the error.
@zpp13 @chenmyzju I just commented out the code for training RPN1 and ran bash scripts/train_alternate.sh
to resume training from the RPN-detection generation step. You should also kill any leftover processes on your GPU; sometimes GPU memory is not released after the RPN training process finishes.
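A minimal sketch of the cleanup step, assuming Linux with the procps `ps` tool: python processes stuck in the stopped (`T`) state can keep GPU memory allocated even after the training stage exits, so find them before resuming.

```shell
# Sketch, assuming Linux: count python processes in the stopped (T) state,
# which can hold GPU memory after RPN training finishes.
stopped=$(ps -eo pid=,stat=,comm= | awk '$2 ~ /^T/ && $3 ~ /python/ {print $1}')
count=$(printf '%s\n' "$stopped" | grep -c '[0-9]' || true)
echo "stopped python processes: $count"
# Uncomment to actually kill them and release the GPU memory they hold:
# [ -n "$stopped" ] && kill -9 $stopped
```

You can also check the "Processes" table printed by nvidia-smi to see which PIDs still hold GPU memory.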
Have you guys checked for another, duplicated configuration outside of config.py?
@KaiyuYue Yes, setting a smaller BATCH_ROIS reduces GPU memory usage when training the RCNN, but it also lowers the final performance. Check out the repo https://github.com/LeonJWH/mx-maskrcnn.
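The trade-off above can be sketched as a config tweak. This is a hypothetical helper, not the repo's code: `config` here is a stand-in for the easydict-style namespace in rcnn/config.py, and `halve_batch_rois` and the `floor` value are illustrative names.

```python
# Sketch, assuming an easydict-like config namespace as in rcnn/config.py.
# Halving TRAIN.BATCH_ROIS trades final accuracy for GPU memory.
from types import SimpleNamespace

config = SimpleNamespace(TRAIN=SimpleNamespace(BATCH_ROIS=128))

def halve_batch_rois(cfg, floor=8):
    """Halve BATCH_ROIS, but never go below `floor` (hypothetical helper)."""
    cfg.TRAIN.BATCH_ROIS = max(floor, cfg.TRAIN.BATCH_ROIS // 2)
    return cfg.TRAIN.BATCH_ROIS

print(halve_batch_rois(config))  # 128 -> 64
print(halve_batch_rois(config))  # 64 -> 32
```

As the thread shows, values like 32 or even 8 may still OOM if something else (a stuck process, a duplicated config) is holding memory, so check those first.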
I encountered the "out of memory" problem during the "# TRAIN RCNN WITH IMAGENET INIT AND RPN DETECTION" step. The error message is:
DeprecationWarning: Numeric-style type codes are deprecated and will result in an error in the future.
label.append(labels[self.label.index('rcnn_label_stride%s' % s)].asnumpy().reshape((-1,)).astype('Int32'))
Traceback (most recent call last):
File "train_alternate_mask_fpn.py", line 163, in
I changed BATCH_ROIS from 128 to 32, but it was useless. Does anybody know how to deal with it?
Solved the problem by killing some "stopped" python processes and changing BATCH_ROIS to a smaller value.
@zhuaa What BATCH_ROIS value did you use after modifying it? I have also met this problem, and it cannot be solved by changing BATCH_ROIS to 64.
@zzw1123 Did you solve the problem? I changed TRAIN.BATCH_ROIS to 8 and it still didn't work.
When generating RPN detections after training RPN1, the process shut down. The error message is shown below:
I use 4 TITAN Xp GPUs, with 1 image per GPU. I do not know where the problem is.