Could you re-articulate the problem? What do you mean by 1) "training would break down"? 2) "The RoI I used belongs to frame T for the correlation features and the pair of frames' feature maps"? (Make sure you use only those RoIs for the tracking loss that have an object correspondence between frame T and frame T+t.)
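As a rough sketch of what that correspondence filtering means (hypothetical helper, not the code in this repo; it assumes the ground-truth boxes of both frames carry shared track ids and uses a simple IoU match):

```python
import numpy as np

def iou(box, boxes):
    """IoU between one box [x1, y1, x2, y2] and an array of boxes (N, 4)."""
    x1 = np.maximum(box[0], boxes[:, 0]); y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2]); y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1 + 1, 0, None) * np.clip(y2 - y1 + 1, 0, None)
    area_a = (box[2] - box[0] + 1) * (box[3] - box[1] + 1)
    area_b = (boxes[:, 2] - boxes[:, 0] + 1) * (boxes[:, 3] - boxes[:, 1] + 1)
    return inter / (area_a + area_b - inter)

def select_track_rois(rois_t, gt_t, ids_t, gt_tp, ids_tp, fg_thresh=0.5):
    """Keep only frame-T RoIs whose best-matching ground-truth object also
    appears (same track id) in frame T+t; return those RoIs together with
    their target boxes in T+t for the tracking regression."""
    ids_tp = np.asarray(ids_tp)
    keep_rois, targets = [], []
    for roi in rois_t:
        overlaps = iou(roi, gt_t)
        best = int(np.argmax(overlaps))
        if overlaps[best] < fg_thresh:
            continue                       # background RoI: no tracking loss
        match = np.where(ids_tp == ids_t[best])[0]
        if match.size == 0:
            continue                       # object not present in T+t: skip
        keep_rois.append(roi)
        targets.append(gt_tp[match[0]])
    return np.asarray(keep_rois), np.asarray(targets)
```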
My training process is as follows. First I trained the R-FCN detector end-to-end, i.e. the RPN and R-FCN networks are merged into one network during training. Then I fine-tuned the D&T architecture starting from that model. For RoI tracking, I used the foreground RoIs of frame T that have a correspondence in frame T+t. For these foreground RoIs, the output channel count of the convolution layer before position-sensitive pooling is 196 = 4*7^2. With this setup, training breaks down: the 'rpn_bbox_pred' layer, which outputs the bbox deltas, produces NaN values, and I do not know why. When fine-tuning with the tracking loss, is the RPN loss necessary? Should I drop the RPN loss supervision?
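A minimal sketch (assuming the standard pycaffe Net interface) of the kind of check that localises where the NaNs first show up, run right after forward/backward of a debugging iteration; blob names other than 'rpn_bbox_pred' are placeholders:

```python
import numpy as np

def find_non_finite(net, blob_names=('rpn_bbox_pred',)):
    """Scan selected blobs plus all parameter gradients of a pycaffe Net
    for NaN/Inf values; call right after net.forward()/net.backward()."""
    bad = []
    for name in blob_names:
        if name in net.blobs and not np.all(np.isfinite(net.blobs[name].data)):
            bad.append(('blob', name))
    for name, params in net.params.items():
        for i, p in enumerate(params):
            if not np.all(np.isfinite(p.diff)):
                bad.append(('param grad', '%s[%d]' % (name, i)))
    return bad
```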
When I fine-tuned the RoI tracking part on top of the trained R-FCN detector, training breaks down. The RoIs I used belong to frame T, for both the correlation features and the pair of frames' feature maps. The initial learning rate I set is 0.0001. How can I solve this?
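For reference, a minimal pycaffe sketch of this kind of fine-tuning run with a lowered learning rate and an early stop on the first non-finite 'rpn_bbox_pred' output (file paths and the iteration count are hypothetical; the actual learning-rate schedule lives in the solver prototxt):

```python
import numpy as np
import caffe

# Hypothetical paths - substitute your own D&T solver and R-FCN weights.
SOLVER_PROTO = 'models/dnt_solver.prototxt'   # e.g. base_lr: 1e-5 in the solver
RFCN_WEIGHTS = 'output/rfcn_detector_final.caffemodel'

caffe.set_mode_gpu()
solver = caffe.SGDSolver(SOLVER_PROTO)
solver.net.copy_from(RFCN_WEIGHTS)            # initialise from the trained detector

for it in range(1, 20001):
    solver.step(1)
    deltas = solver.net.blobs['rpn_bbox_pred'].data
    if not np.all(np.isfinite(deltas)):
        # Inspect the current minibatch / RoIs here before the loss hits NaN.
        print('rpn_bbox_pred went non-finite at iteration %d' % it)
        break
```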