[09/13 20:20:43 fsdet]: Full config saved to /home/wangyufei/Code/FSCE/checkpoints/voc/faster_rcnn/faster_rcnn_R_101_FPN_base1/config.yaml
[09/13 20:20:43 fsdet.utils.env]: Using a generated random seed 43966721
frozen resnet backbone stage 2 (this froze ResNet but not FPN, frozen backbone in rcnn.py will overwrite this)
frozen resnet backbone stage 2 (this froze ResNet but not FPN, frozen backbone in rcnn.py will overwrite this)
-------- Using Roi Head: StandardROIHeads---------
-------- Using Roi Head: StandardROIHeads---------
[09/13 20:22:41 fsdet.data.build]: Removed 1920 images with no usable annotations. 14631 images left.
[09/13 20:22:41 fsdet.data.build]: Distribution of training instances among all 15 categories:
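|  category   | #instances |  category   | #instances |  category   | #instances |
|:-----------:|:-----------|:-----------:|:-----------|:-----------:|:-----------|
|  aeroplane  | 1285       |   bicycle   | 1208       |    boat     | 1397       |
|   bottle    | 2116       |     car     | 4008       |     cat     | 1616       |
|    chair    | 4338       | diningtable | 1057       |     dog     | 2079       |
|    horse    | 1156       |   person    | 15576      | pottedplant | 1724       |
|    sheep    | 1347       |    train    | 984        |  tvmonitor  | 1193       |
|    total    | 41084      |             |            |             |            |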
[09/13 20:22:41 fsdet.data.detection_utils]: TransformGens used in training: [ResizeShortestEdge(short_edge_length=(480, 512, 544, 576, 608, 640, 672, 704, 736, 768, 800), max_size=1333, sample_style='choice'), RandomFlip()]
[09/13 20:22:41 fsdet.data.build]: Using training sampler TrainingSampler
[09/13 20:23:14 fvcore.common.checkpoint]: Loading checkpoint from checkpoints/pretrained_model/R-101.pkl
[09/13 20:23:14 fsdet.checkpoint.c2_model_loading]: Remapping C2 weights ......
[09/13 20:23:15 fsdet.checkpoint.c2_model_loading]: Some model parameters are not in the checkpoint:
backbone.fpn_lateral2.{bias, weight}
backbone.fpn_lateral3.{bias, weight}
backbone.fpn_lateral4.{bias, weight}
backbone.fpn_lateral5.{bias, weight}
backbone.fpn_output2.{bias, weight}
backbone.fpn_output3.{bias, weight}
backbone.fpn_output4.{bias, weight}
backbone.fpn_output5.{bias, weight}
proposal_generator.anchor_generator.cell_anchors.{0, 1, 2, 3, 4}
proposal_generator.rpn_head.anchor_deltas.{bias, weight}
proposal_generator.rpn_head.conv.{bias, weight}
proposal_generator.rpn_head.objectness_logits.{bias, weight}
roi_heads.box_head.fc1.{bias, weight}
roi_heads.box_head.fc2.{bias, weight}
roi_heads.box_predictor.bbox_pred.{bias, weight}
roi_heads.box_predictor.cls_score.{bias, weight}
[09/13 20:23:15 fsdet.checkpoint.c2_model_loading]: The checkpoint contains parameters not used by the model:
fc1000_b
fc1000_w
[09/13 20:23:15 fsdet.engine.train_loop]: Starting training from iteration 0
/home/wangyufei/anaconda3/envs/FSCE/lib/python3.6/multiprocessing/semaphore_tracker.py:143: UserWarning: semaphore_tracker: There appear to be 20 leaked semaphores to clean up at shutdown
  len(cache))
Traceback (most recent call last):
  File "tools/train_net.py", line 130, in <module>
    args=(args,),
  File "/home/wangyufei/Code/FSCE/fsdet/engine/launch.py", line 49, in launch
    daemon=False,
  File "/home/wangyufei/anaconda3/envs/FSCE/lib/python3.6/site-packages/torch/multiprocessing/spawn.py", line 171, in spawn
    while not spawn_context.join():
  File "/home/wangyufei/anaconda3/envs/FSCE/lib/python3.6/site-packages/torch/multiprocessing/spawn.py", line 107, in join
    (error_index, name)
Exception: process 0 terminated with signal SIGSEGV
(FSCE) [wangyufei@node03 FSCE]$ /home/wangyufei/anaconda3/envs/FSCE/lib/python3.6/multiprocessing/semaphore_tracker.py:143: UserWarning: semaphore_tracker: There appear to be 20 leaked semaphores to clean up at shutdown
len(cache))
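The traceback above comes from the parent process only; it reports that worker 0 died with SIGSEGV but not where. One way to get a Python-level stack from the crashing worker might be the standard-library `faulthandler` module. The following is a minimal sketch, not part of FSCE: the helper name `enable_crash_traces` is hypothetical, and it assumes the call is placed at the very start of the function that each spawned worker runs.

```python
import faulthandler
import sys


def enable_crash_traces(stream=sys.stderr):
    """Ask Python to dump the stack of every thread on fatal signals.

    faulthandler installs handlers for SIGSEGV, SIGFPE, SIGABRT, SIGBUS
    and SIGILL. The stream must be a real file (it needs a fileno) and
    must stay open for as long as the handler is enabled.
    """
    faulthandler.enable(file=stream, all_threads=True)
    return faulthandler.is_enabled()
```

Because the workers are started with `spawn`, enabling it in the parent alone would not cover them. Setting the environment variable `PYTHONFAULTHANDLER=1` in front of the training command should have the same effect without code changes, since the child processes inherit the environment.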
Here are the details.
(FSCE) [wangyufei@node03 FSCE]$ CUDA_VISIBLE_DEVICES=8,9 python tools/train_net.py --num-gpus 2 --config-file configs/PASCAL_VOC/base-training/R101_FPN_base_training_split1.yml
Command Line Args: Namespace(config_file='configs/PASCAL_VOC/base-training/R101_FPN_base_training_split1.yml', dist_url='tcp://127.0.0.1:50363', end_iter=-1, eval_all=False, eval_during_train=False, eval_iter=-1, eval_only=False, machine_rank=0, num_gpus=2, num_machines=1, opts=[], resume=False, start_iter=-1)
[09/13 20:20:43 fsdet]: Rank of current process: 0. World size: 2
[09/13 20:20:43 fsdet]: Command line arguments: Namespace(config_file='configs/PASCAL_VOC/base-training/R101_FPN_base_training_split1.yml', dist_url='tcp://127.0.0.1:50363', end_iter=-1, eval_all=False, eval_during_train=False, eval_iter=-1, eval_only=False, machine_rank=0, num_gpus=2, num_machines=1, opts=[], resume=False, start_iter=-1)
[09/13 20:20:43 fsdet]: Contents of args.config_file=configs/PASCAL_VOC/base-training/R101_FPN_base_training_split1.yml:
_BASE_: "../../Base-RCNN-FPN.yaml"
MODEL:
  WEIGHTS: "checkpoints/pretrained_model/R-101.pkl"
  MASK_ON: False
  RESNETS:
    DEPTH: 101
  ROI_HEADS:
    NUM_CLASSES: 15
INPUT:
  MIN_SIZE_TRAIN: (480, 512, 544, 576, 608, 640, 672, 704, 736, 768, 800)
  MIN_SIZE_TEST: 800
DATASETS:
  TRAIN: ('voc_2007_trainval_base1', 'voc_2012_trainval_base1')
  TEST: ('voc_2007_test_base1',)
SOLVER:
  STEPS: (12000, 16000)
  MAX_ITER: 18000  # 17.4 epochs
  WARMUP_ITERS: 100
OUTPUT_DIR: "checkpoints/voc/faster_rcnn/faster_rcnn_R_101_FPN_base1"