researchmm / TracKit

[ECCV'20] Ocean: Object-aware Anchor-Free Tracking
MIT License

train_ocean.py hangs and does not continue #59

Closed interflow-miao closed 3 years ago

interflow-miao commented 3 years ago

Hello, when I run train_ocean.py, execution reaches the data-loading-time measurement in /lib/core/function, at the line for iter, input in enumerate(train_loader):, and never proceeds. There is no error message, and the log stops updating. The full log is below. (I don't quite understand the first line: why are gpus and workers both None?)

2020-12-23 16:30:27,650 Namespace(cfg='experiments/train/Ocean.yaml', gpus=None, workers=None)
2020-12-23 16:30:27,652 {'CHECKPOINT_DIR': 'snapshot', 'GPUS': '1', 'OCEAN': {'DATASET': {'BLUR': 0, 'CHANNEL6': 0, 'COCO': {'ANNOTATION': './data/coco/train2014.json', 'PATH': './data/coco/crop511', 'RANGE': 1, 'USE': 60000}, 'COLOR': 1, 'CUTOUT': 0, 'DET': {'ANNOTATION': './data/ILSVRC2015/DET/train.json', 'PATH': './data/ILSVRC2015/DET', 'RANGE': 100, 'USE': 60000}, 'FLIP': 0, 'GOT10K': {'ANNOTATION': './data/got10k/train.json', 'PATH': './data/got10k/crop511', 'RANGE': 100, 'USE': 160000}, 'GRAY': 0, 'LABELSMOOTH': False, 'LASOT': {}, 'MIXUP': 0, 'ROTATION': 0, 'SCALE': 0.05, 'SCALEs': 0.18, 'SHIFT': 4, 'SHIFTs': 64, 'VID': {'ANNOTATION': './data/ILSVRC2015/VID/train.json', 'PATH': './data/ILSVRC2015/VID/crop511', 'RANGE': 100, 'USE': 110000}, 'VISDRONE': {'ANNOTATION': '$data_path/visdrone/train.json', 'PATH': '$data_path/visdrone/crop271', 'RANGE': 100, 'USE': 100000}, 'YTB': {'ANNOTATION': './data/y2b/train.json', 'PATH': './data/y2b/crop511', 'RANGE': 3, 'USE': 210000}}, 'TEST': {'DATA': 'VOT2019', 'END_EPOCH': 50, 'ISTRUE': False, 'MODEL': 'Ocean', 'RGBTSPLIT': 'None', 'START_EPOCH': 30, 'THREADS': 16}, 'TRAIN': {'ALIGN': True, 'BASE_LR': 0.005, 'BATCH': 16, 'END_EPOCH': 50, 'EXID': 'setting1', 'GROUP': 'resrchvc', 'ISTRUE': True, 'LAYERS_LR': 0.1, 'LR': {'KWARGS': {'end_lr': 1e-05, 'start_lr': 0.005}, 'TYPE': 'log'}, 'LR_END': 1e-05, 'LR_POLICY': 'log', 'MODEL': 'Ocean', 'MOMENTUM': 0.9, 'PRETRAIN': 'pretrain.model', 'RESUME': False, 'SEARCH_SIZE': 255, 'START_EPOCH': 0, 'STRIDE': 8, 'TEMPLATE_SIZE': 127, 'TRAINABLE_LAYER': ['layer1', 'layer2', 'layer3'], 'UNFIX_EPOCH': 10, 'UNFIX_POLICY': 'log', 'WARMUP': {'EPOCH': 5, 'IFNOT': True, 'KWARGS': {'end_lr': 0.005, 'start_lr': 0.001, 'step': 1}, 'TYPE': 'step'}, 'WARM_POLICY': 'step', 'WEIGHT_DECAY': 0.0001, 'WHICH_USE': ['VID', 'COCO', 'DET', 'GOT10K']}, 'TUNE': {'DATA': 'VOT2019', 'ISTRUE': True, 'METHOD': 'TPE', 'MODEL': 'Ocean', 'RGBTSPLT': 'None'}}, 'OUTPUT_DIR': 'logs', 'PRINT_FREQ': 10, 'WORKERS': 2}
2020-12-23 16:30:29,991 trainable params:
2020-12-23 16:30:29,991 neck.downsample.0.weight
2020-12-23 16:30:29,991 neck.downsample.1.weight
2020-12-23 16:30:29,991 neck.downsample.1.bias
2020-12-23 16:30:29,991 connect_model.adjust
2020-12-23 16:30:29,991 connect_model.bias
2020-12-23 16:30:29,991 connect_model.cls_encode.matrix11_k.0.weight
2020-12-23 16:30:29,991 connect_model.cls_encode.matrix11_k.1.weight
2020-12-23 16:30:29,991 connect_model.cls_encode.matrix11_k.1.bias
2020-12-23 16:30:29,991 connect_model.cls_encode.matrix11_s.0.weight
2020-12-23 16:30:29,992 connect_model.cls_encode.matrix11_s.1.weight
2020-12-23 16:30:29,992 connect_model.cls_encode.matrix11_s.1.bias
2020-12-23 16:30:29,992 connect_model.cls_encode.matrix12_k.0.weight
2020-12-23 16:30:29,992 connect_model.cls_encode.matrix12_k.1.weight
2020-12-23 16:30:29,992 connect_model.cls_encode.matrix12_k.1.bias
2020-12-23 16:30:29,992 connect_model.cls_encode.matrix12_s.0.weight
2020-12-23 16:30:29,992 connect_model.cls_encode.matrix12_s.1.weight
2020-12-23 16:30:29,992 connect_model.cls_encode.matrix12_s.1.bias
2020-12-23 16:30:29,992 connect_model.cls_encode.matrix21_k.0.weight
2020-12-23 16:30:29,992 connect_model.cls_encode.matrix21_k.1.weight
2020-12-23 16:30:29,992 connect_model.cls_encode.matrix21_k.1.bias
2020-12-23 16:30:29,992 connect_model.cls_encode.matrix21_s.0.weight
2020-12-23 16:30:29,992 connect_model.cls_encode.matrix21_s.1.weight
2020-12-23 16:30:29,992 connect_model.cls_encode.matrix21_s.1.bias
2020-12-23 16:30:29,992 connect_model.reg_encode.matrix11_k.0.weight
2020-12-23 16:30:29,992 connect_model.reg_encode.matrix11_k.1.weight
2020-12-23 16:30:29,992 connect_model.reg_encode.matrix11_k.1.bias
2020-12-23 16:30:29,992 connect_model.reg_encode.matrix11_s.0.weight
2020-12-23 16:30:29,992 connect_model.reg_encode.matrix11_s.1.weight
2020-12-23 16:30:29,992 connect_model.reg_encode.matrix11_s.1.bias
2020-12-23 16:30:29,992 connect_model.reg_encode.matrix12_k.0.weight
2020-12-23 16:30:29,992 connect_model.reg_encode.matrix12_k.1.weight
2020-12-23 16:30:29,992 connect_model.reg_encode.matrix12_k.1.bias
2020-12-23 16:30:29,992 connect_model.reg_encode.matrix12_s.0.weight
2020-12-23 16:30:29,992 connect_model.reg_encode.matrix12_s.1.weight
2020-12-23 16:30:29,992 connect_model.reg_encode.matrix12_s.1.bias
2020-12-23 16:30:29,992 connect_model.reg_encode.matrix21_k.0.weight
2020-12-23 16:30:29,992 connect_model.reg_encode.matrix21_k.1.weight
2020-12-23 16:30:29,992 connect_model.reg_encode.matrix21_k.1.bias
2020-12-23 16:30:29,992 connect_model.reg_encode.matrix21_s.0.weight
2020-12-23 16:30:29,992 connect_model.reg_encode.matrix21_s.1.weight
2020-12-23 16:30:29,992 connect_model.reg_encode.matrix21_s.1.bias
2020-12-23 16:30:29,992 connect_model.cls_dw.weight
2020-12-23 16:30:29,992 connect_model.reg_dw.weight
2020-12-23 16:30:29,992 connect_model.bbox_tower.0.weight
2020-12-23 16:30:29,992 connect_model.bbox_tower.0.bias
2020-12-23 16:30:29,992 connect_model.bbox_tower.1.weight
2020-12-23 16:30:29,992 connect_model.bbox_tower.1.bias
2020-12-23 16:30:29,992 connect_model.bbox_tower.3.weight
2020-12-23 16:30:29,992 connect_model.bbox_tower.3.bias
2020-12-23 16:30:29,992 connect_model.bbox_tower.4.weight
2020-12-23 16:30:29,992 connect_model.bbox_tower.4.bias
2020-12-23 16:30:29,992 connect_model.bbox_tower.6.weight
2020-12-23 16:30:29,992 connect_model.bbox_tower.6.bias
2020-12-23 16:30:29,992 connect_model.bbox_tower.7.weight
2020-12-23 16:30:29,992 connect_model.bbox_tower.7.bias
2020-12-23 16:30:29,993 connect_model.bbox_tower.9.weight
2020-12-23 16:30:29,993 connect_model.bbox_tower.9.bias
2020-12-23 16:30:29,993 connect_model.bbox_tower.10.weight
2020-12-23 16:30:29,993 connect_model.bbox_tower.10.bias
2020-12-23 16:30:29,993 connect_model.cls_tower.0.weight
2020-12-23 16:30:29,993 connect_model.cls_tower.0.bias
2020-12-23 16:30:29,993 connect_model.cls_tower.1.weight
2020-12-23 16:30:29,993 connect_model.cls_tower.1.bias
2020-12-23 16:30:29,993 connect_model.cls_tower.3.weight
2020-12-23 16:30:29,993 connect_model.cls_tower.3.bias
2020-12-23 16:30:29,993 connect_model.cls_tower.4.weight
2020-12-23 16:30:29,993 connect_model.cls_tower.4.bias
2020-12-23 16:30:29,993 connect_model.cls_tower.6.weight
2020-12-23 16:30:29,993 connect_model.cls_tower.6.bias
2020-12-23 16:30:29,993 connect_model.cls_tower.7.weight
2020-12-23 16:30:29,993 connect_model.cls_tower.7.bias
2020-12-23 16:30:29,993 connect_model.cls_tower.9.weight
2020-12-23 16:30:29,993 connect_model.cls_tower.9.bias
2020-12-23 16:30:29,993 connect_model.cls_tower.10.weight
2020-12-23 16:30:29,993 connect_model.cls_tower.10.bias
2020-12-23 16:30:29,993 connect_model.bbox_pred.weight
2020-12-23 16:30:29,993 connect_model.bbox_pred.bias
2020-12-23 16:30:29,993 connect_model.cls_pred.weight
2020-12-23 16:30:29,993 connect_model.cls_pred.bias
2020-12-23 16:30:29,993 align_head.rpn_conv.conv.weight
2020-12-23 16:30:29,993 align_head.rpn_cls.weight
2020-12-23 16:30:29,993 align_head.rpn_cls.bias
2020-12-23 16:30:29,993 GPU NUM: 1
2020-12-23 16:30:31,904 (WarmUPScheduler) lr spaces: [1.00000000e-03 1.37972966e-03 1.90365394e-03 2.62652780e-03 3.62389832e-03 5.00000000e-03 4.34139975e-03 3.76955036e-03 3.27302500e-03 2.84190198e-03 2.46756651e-03 2.14253853e-03 1.86032325e-03 1.61528137e-03 1.40251643e-03 1.21777690e-03 1.05737126e-03 9.18094268e-04 7.97162845e-04 6.92160515e-04 6.00989098e-04 5.21826784e-04 4.53091734e-04 3.93410468e-04 3.41590422e-04 2.96596114e-04 2.57528459e-04 2.23606798e-04 1.94153299e-04 1.68579417e-04 1.46374128e-04 1.27093720e-04 1.10352929e-04 9.58172358e-05 8.31961847e-05 7.22375791e-05 6.27224416e-05 5.44606385e-05 4.72870805e-05 4.10584239e-05 3.56502062e-05 3.09543593e-05 2.68770495e-05 2.33368032e-05 2.02628783e-05 1.75938510e-05 1.52763881e-05 1.32641815e-05 1.15170228e-05 1.00000000e-05]
2020-12-23 16:30:31,905 model prepare done

After that the code showed no further activity. When I interrupted it manually (Ctrl+C), the output was as follows:

  File "tracking/train_ocean.py", line 259, in <module>
    main()
  File "tracking/train_ocean.py", line 250, in main
    model, writer_dict = ocean_train(train_loader, model, optimizer, epoch + 1, curLR, config, writer_dict, logger, device=device)
  File "/home/miaost/tracking/TracKit-master/tracking/../lib/core/function.py", line 26, in ocean_train
    # measure data loading time
  File "/home/miaost/anaconda3/envs/TracKit/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 576, in __next__
    idx, batch = self._get_batch()
  File "/home/miaost/anaconda3/envs/TracKit/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 543, in _get_batch
    success, data = self._try_get_batch()
  File "/home/miaost/anaconda3/envs/TracKit/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 511, in _try_get_batch
    data = self.data_queue.get(timeout=timeout)
  File "/home/miaost/anaconda3/envs/TracKit/lib/python3.7/queue.py", line 179, in get
    self.not_empty.wait(remaining)
  File "/home/miaost/anaconda3/envs/TracKit/lib/python3.7/threading.py", line 300, in wait
    gotit = waiter.acquire(True, timeout)
KeyboardInterrupt

What could be going wrong here? Running test_ocean works without any problem. I look forward to your reply. Thank you very much!
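On the side question about `gpus=None, workers=None`: the log is consistent with the flags simply not having been passed on the command line, while the config dump on the next log line still carries `'GPUS': '1'` and `'WORKERS': 2`. The sketch below illustrates that general `argparse` behaviour. It is not taken from train_ocean.py; `parse_args`, `effective_settings`, and `yaml_config` are hypothetical names, and the merge rule (command line wins, config file otherwise) is an assumption about how such scripts are commonly written.

```python
import argparse


def parse_args(argv=None):
    """Hypothetical parser mirroring the three fields seen in the log."""
    parser = argparse.ArgumentParser(description="hypothetical training entry point")
    parser.add_argument("--cfg", type=str, default="experiments/train/Ocean.yaml")
    # No default is given, so argparse stores None when the flag is absent.
    parser.add_argument("--gpus", type=str)
    parser.add_argument("--workers", type=int)
    return parser.parse_args(argv)


def effective_settings(args, config):
    """Assumed merge rule: command-line values win, else keep the config values."""
    merged = dict(config)
    if args.gpus is not None:
        merged["GPUS"] = args.gpus
    if args.workers is not None:
        merged["WORKERS"] = args.workers
    return merged


# Values copied from the config dump in the log.
yaml_config = {"GPUS": "1", "WORKERS": 2}

args = parse_args([])  # no flags passed, as in the reported run
print(args)
print(effective_settings(args, yaml_config))
```

Under this assumption, `None` in the `Namespace` line is harmless: it only means the values from the YAML file are the ones in effect.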

JudasDie commented 3 years ago

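Hello. First try reducing workers and batch. If that doesn't help, try debugging in PyCharm to see whether anything is wrong. If it still doesn't work after that, it is most likely a hardware problem.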
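The traceback above stops inside `data_queue.get`, meaning the main process was waiting for a batch from the loader's worker side. Besides lowering `WORKERS` and `BATCH` in the config, one way to narrow this down is to load individual samples directly in the main process and time them, which bypasses the workers entirely. The following is a minimal sketch of that idea, not the maintainer's procedure: `probe_samples` and `ToyDataset` are hypothetical, and it assumes the training dataset supports `len()` and integer indexing.

```python
import time


def probe_samples(dataset, indices, slow_threshold_s=1.0):
    """Load samples one by one in the current process and time each load.

    Returns (index, seconds, error) tuples for samples that raised or took
    longer than slow_threshold_s; an empty list means all probed samples
    loaded quickly.
    """
    problems = []
    for i in indices:
        start = time.perf_counter()
        error = None
        try:
            dataset[i]
        except Exception as exc:  # record the failure and keep probing
            error = repr(exc)
        elapsed = time.perf_counter() - start
        if error is not None or elapsed > slow_threshold_s:
            problems.append((i, elapsed, error))
    return problems


class ToyDataset:
    """Hypothetical stand-in for the real training dataset."""

    def __len__(self):
        return 4

    def __getitem__(self, i):
        if i == 2:
            raise FileNotFoundError("missing crop for sample 2")
        return {"index": i}


if __name__ == "__main__":
    for index, seconds, error in probe_samples(ToyDataset(), range(4)):
        print(f"sample {index}: {seconds:.3f}s, error={error}")
```

If every probed sample loads quickly in the main process but iteration still stalls once worker processes are involved, that points away from the data itself and toward the multiprocessing setup or the machine, which matches the order of checks suggested in the reply.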

interflow-miao commented 3 years ago

OK, I'll give that a try. Thank you very much!!
