lab2: gsc 20230226220526_vctree_semantic_sgcls_4GPU_lab1_1e3
(also in lab3: 20230226220656_vctree_semantic_predcls_4GPU_lab1_1e3)
no rels_new for rel_og=has
no rels_new for rel_og=with
no rels_new for rel_og=in front of
no rels_new for rel_og=in front of
3209: Augmentation: 1 => 1
3209: Augmentation: 1 => 1
no rels_new for rel_og=on
no rels_new for rel_og=on
no rels_new for rel_og=on
no rels_new for rel_og=with
no rels_new for rel_og=on
no rels_new for rel_og=on
3209: Augmentation: 1 => 1
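The lines above come from the semantic relation-augmentation step: for each ground-truth predicate (rel_og) it looks up candidate replacement predicates (rels_new) and logs how the per-image relation count changes (here 1 => 1, i.e. nothing was added). A minimal sketch of that logging pattern only, assuming a hypothetical lookup table rel_alternatives and the image id 3209; this illustrates the log format, not the repo's actual augmentation code:

def augment_relations(img_id, rels, rel_alternatives):
    """Hypothetical sketch: try to add alternative predicates for each relation.

    rels is a list of (subj_idx, pred_label, obj_idx) triples;
    rel_alternatives maps a predicate label to candidate replacements
    (possibly an empty list).
    """
    out = list(rels)
    for subj, pred, obj in rels:
        rels_new = rel_alternatives.get(pred, [])
        if not rels_new:
            # Matches the "no rels_new for rel_og=..." lines above.
            print(f"no rels_new for rel_og={pred}")
            continue
        out.extend((subj, new_pred, obj) for new_pred in rels_new)
    # Matches the "<img_id>: Augmentation: <before> => <after>" lines above.
    print(f"{img_id}: Augmentation: {len(rels)} => {len(out)}")
    return out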
Traceback (most recent call last):
File "/home/pct4et/gsc/tools/relation_train_net.py", line 665, in <module>
main()
File "/home/pct4et/gsc/tools/relation_train_net.py", line 650, in main
train(cfg, local_rank, args.distributed, logger, experiment)
File "/home/pct4et/gsc/tools/relation_train_net.py", line 327, in train
loss_dict = model(images, targets)
File "/home/pct4et/miniconda3/envs/gsc/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
return forward_call(*input, **kwargs)
File "/home/pct4et/miniconda3/envs/gsc/lib/python3.8/site-packages/torch/nn/parallel/distributed.py", line 1008, in forward
output = self._run_ddp_forward(*inputs, **kwargs)
File "/home/pct4et/miniconda3/envs/gsc/lib/python3.8/site-packages/torch/nn/parallel/distributed.py", line 969, in _run_ddp_forward
return module_to_run(*inputs[0], **kwargs[0])
File "/home/pct4et/miniconda3/envs/gsc/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
return forward_call(*input, **kwargs)
File "/home/pct4et/gsc/maskrcnn_benchmark/modeling/detector/generalized_rcnn.py", line 76, in forward
_, result, detector_losses = self.roi_heads(features, proposals, targets, logger, boxes_global=boxes_global)
File "/home/pct4et/miniconda3/envs/gsc/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
return forward_call(*input, **kwargs)
File "/home/pct4et/gsc/maskrcnn_benchmark/modeling/roi_heads/roi_heads.py", line 69, in forward
x, detections, loss_relation = self.relation(features, detections, targets, logger, boxes_global=boxes_global)
File "/home/pct4et/miniconda3/envs/gsc/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
return forward_call(*input, **kwargs)
File "/home/pct4et/gsc/maskrcnn_benchmark/modeling/roi_heads/relation_head/relation_head.py", line 99, in forward
union_features = self.union_feature_extractor(features, proposals, rel_pair_idxs)
File "/home/pct4et/miniconda3/envs/gsc/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
return forward_call(*input, **kwargs)
File "/home/pct4et/gsc/maskrcnn_benchmark/modeling/roi_heads/relation_head/roi_relation_feature_extractors.py", line 99, in forward
union_vis_features = self.feature_extractor.pooler(x, union_proposals) # union_proposals: 16 * [651..., 650..., 110...] # union_vis_features: torch.Size([5049, 256, 7, 7]) # TODO: need to borrow pooler's 5 layers to 1 reduction. so have a global union feature pooler
File "/home/pct4et/miniconda3/envs/gsc/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
return forward_call(*input, **kwargs)
File "/home/pct4et/gsc/maskrcnn_benchmark/modeling/poolers.py", line 142, in forward
assert rois.size(0) > 0
AssertionError
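The crash is the pooler's assert rois.size(0) > 0 in poolers.py: the union feature extractor on local_rank 3 got a batch in which no relation pairs survived, so union_proposals was empty by the time it reached the pooler. A minimal sketch of a guard, assuming rel_pair_idxs is the per-image list of (N_i, 2) index tensors passed down from relation_head.py; the helper name and the None fallback are assumptions, not the repo's code:

def pool_union_features(pooler, x, union_proposals, rel_pair_idxs):
    """Hypothetical guard: never call the pooler with zero ROIs."""
    num_pairs = sum(int(p.shape[0]) for p in rel_pair_idxs)
    if num_pairs == 0:
        # poolers.py asserts rois.size(0) > 0, so bail out early and let the
        # caller skip the relation loss for this (empty) batch instead.
        return None
    return pooler(x, union_proposals)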
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 64810 closing signal SIGTERM
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 64811 closing signal SIGTERM
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 64812 closing signal SIGTERM
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 3 (pid: 64813) of binary: /home/pct4et/miniconda3/envs/gsc/bin/python
Traceback (most recent call last):
File "/home/pct4et/miniconda3/envs/gsc/bin/torchrun", line 33, in <module>
sys.exit(load_entry_point('torch==1.12.1', 'console_scripts', 'torchrun')())
File "/home/pct4et/miniconda3/envs/gsc/lib/python3.8/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 345, in wrapper
return f(*args, **kwargs)
File "/home/pct4et/miniconda3/envs/gsc/lib/python3.8/site-packages/torch/distributed/run.py", line 761, in main
run(args)
File "/home/pct4et/miniconda3/envs/gsc/lib/python3.8/site-packages/torch/distributed/run.py", line 752, in run
elastic_launch(
File "/home/pct4et/miniconda3/envs/gsc/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 131, in __call__
return launch_agent(self._config, self._entrypoint, list(args))
File "/home/pct4et/miniconda3/envs/gsc/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 245, in launch_agent
raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
============================================================