votchallenge / toolkit

The official VOT Challenge evaluation and analysis toolkit
http://www.votchallenge.net/
GNU General Public License v3.0

Timeout reached, terminating tracker #69

Closed by Yioutpi 1 year ago

Yioutpi commented 1 year ago

Dear VOT committee, I ran vot evaluate with the VOT2022 RGBD stack, but it stopped abruptly on the sequence. Versions: vot-toolkit 0.5.3, vot-trax 3.0.3. Log:

test config: {'MODEL': {'HEAD_TYPE': 'CORNER', 'MERGE_TYPE': 'max', 'HIDDEN_DIM': 1024, 'NUM_OBJECT_QUERIES': 1, 'POSITION_EMBEDDING': 'sine', 'PREDICT_MASK': False, 'BACKBONE': {'PRETRAINED': False, 'PRETRAINED_PATH': ~/tracking/pretrain/CvT-w24-384x384-IN-22k.pth', 'INIT': 'trunc_norm', 'NUM_STAGES': 3, 'PATCH_SIZE': [7, 3, 3], 'PATCH_STRIDE': [4, 2, 2], 'PATCH_PADDING': [2, 1, 1], 'DIM_EMBED': [192, 768, 1024], 'NUM_HEADS': [3, 12, 16], 'DEPTH': [2, 2, 12], 'MLP_RATIO': [4.0, 4.0, 4.0], 'ATTN_DROP_RATE': [0.0, 0.0, 0.0], 'DROP_RATE': [0.0, 0.0, 0.0], 'DROP_PATH_RATE': [0.0, 0.0, 0.3], 'QKV_BIAS': [True, True, True], 'CLS_TOKEN': [False, False, False], 'POS_EMBED': [False, False, False], 'QKV_PROJ_METHOD': ['dw_bn', 'dw_bn', 'dw_bn'], 'KERNEL_QKV': [3, 3, 3], 'PADDING_KV': [1, 1, 1], 'STRIDE_KV': [2, 2, 2], 'PADDING_Q': [1, 1, 1], 'STRIDE_Q': [1, 1, 1], 'FREEZE_BN': True}, 'PRETRAINED_STAGE1': True, 'NLAYER_HEAD': 3, 'HEAD_FREEZE_BN': True}, 'TRAIN': {'TRAIN_SCORE': True, 'SCORE_WEIGHT': 1.0, 'LR': 5e-05, 'WEIGHT_DECAY': 0.0001, 'EPOCH': 30, 'LR_DROP_EPOCH': 20, 'BATCH_SIZE': 32, 'NUM_WORKER': 8, 'OPTIMIZER': 'ADAMW', 'BACKBONE_MULTIPLIER': 0.1, 'GIOU_WEIGHT': 2.0, 'L1_WEIGHT': 5.0, 'DEEP_SUPERVISION': False, 'FREEZE_STAGE0': False, 'PRINT_INTERVAL': 50, 'VAL_EPOCH_INTERVAL': 5, 'GRAD_CLIP_NORM': 0.1, 'SCHEDULER': {'TYPE': 'step', 'DECAY_RATE': 0.1}}, 'DATA': {'SAMPLER_MODE': 'trident_pro', 'MEAN': [0.485, 0.456, 0.406], 'STD': [0.229, 0.224, 0.225], 'MAX_SAMPLE_INTERVAL': [200], 'TRAIN': {'DATASETS_NAME': ['COCO17_Depth', 'LASOT_Depth', 'DepthTrack_train', 'GOT10K_Depth'], 'DATASETS_RATIO': [1, 1, 1, 1], 'SAMPLE_PER_EPOCH': 60000}, 'VAL': {'DATASETS_NAME': ['DepthTrack_val'], 'DATASETS_RATIO': [1], 'SAMPLE_PER_EPOCH': 10000}, 'SEARCH': {'SIZE': 320, 'FACTOR': 5.0, 'CENTER_JITTER': 4.5, 'SCALE_JITTER': 0.5}, 'TEMPLATE': {'SIZE': 128, 'FACTOR': 2.0, 'NUMBER': 2, 'CENTER_JITTER': 0, 'SCALE_JITTER': 0}}, 'TEST': {'RE_CONSTRAIN_TYPE': 'simple', 'MAX_SCORE_DECAY': 
0.98, 'TEMPLATE_FACTOR': 2.0, 'TEMPLATE_SIZE': 128, 'SEARCH_FACTOR': 5.0, 'SEARCH_SIZE': 320, 'EPOCH': 40, 'UPDATE_INTERVALS': {'LASOT': [200], 'GOT10K_TEST': [10], 'TRACKINGNET': [25], 'VOT20': [10], 'VOT20LT': [200], 'OTB': [6], 'UAV': [200], 'VOT2022RGBD': [10]}, 'ONLINE_SIZES': {'LASOT': [2], 'GOT10K_TEST': [2], 'TRACKINGNET': [1], 'VOT20': [5], 'VOT20LT': [3], 'OTB': [3], 'UAV': [1], 'VOT2022RGBD': [5]}}}
search_area_scale: 5.0
head channel: 384
Online size is: 5
Update interval is: 10
max score decay = 0.98
@@TRAX:hello "trax.name=" "trax.family=" "trax.image=path;" "trax.region=rectangle;" "trax.description=" "trax.version=3" "vot=python" "trax.channels=color;depth;"
@@TRAX:initialize "file://~/sequences/box_darkroom_noocc_5_1/color/00000101.jpg" "file://~/sequences/box_darkroom_noocc_5_1/depth/00000101.png" "416.0000,268.0000,105.0000,82.0000"
@@TRAX:state "416.0000,268.0000,105.0000,82.0000"
@@TRAX:frame "file://~/sequences/box_darkroom_noocc_5_1/color/00000102.jpg" "file://~/sequences/box_darkroom_noocc_5_1/depth/00000102.png"
Using ~/.cache/torch_extensions/py38_cu113 as PyTorch extensions root...
@@TRAX:quit

Process exited with code (-15)

The terminal report is:
Timeout reached, terminating tracker
Tracker mixformerrgbd_large encountered an error: Tracker interrupted, it did not reply in 30 seconds.

I tried setting a larger timeout, but the error persisted.

lukacu commented 1 year ago

Well, for this error message the issue is a timeout. Why the timeout occurs is hard to say. It could be that loading a model takes too long; in that case increasing the timeout helps. But there can be other problems, so perhaps try putting more debug messages in your code to see what is going on.

From your messages I would assume that some loading is happening on the second frame of the sequence, so you may want to investigate that.
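One way to add such debug messages is a small timing helper wrapped around the suspect steps. This is only a sketch: the name `timed` and the `[debug]` prefix are made up for illustration, and writing to stderr is an assumption made so that the messages stay separate from whatever the tracker prints on stdout.

```python
import sys
import time
from contextlib import contextmanager


@contextmanager
def timed(label, stream=None):
    """Report when the wrapped block starts and how long it took."""
    # Default to stderr (assumption: stdout may be busy with other output).
    stream = stream if stream is not None else sys.stderr
    start = time.perf_counter()
    # flush=True so the message is visible even if the process is killed later.
    print(f"[debug] {label}: start", file=stream, flush=True)
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        print(f"[debug] {label}: done in {elapsed:.2f}s", file=stream, flush=True)


if __name__ == "__main__":
    with timed("track frame"):
        time.sleep(0.01)  # stand-in for the tracker's per-frame call
```

A "start" message with no matching "done" line before the tracker is terminated points at the step that exceeded the timeout.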

Yioutpi commented 1 year ago

Thanks for your reply. I debugged and found the error. It did indeed occur while loading on the second frame of the sequence. The problem was that something was locked under the directory "~/.cache/torch_extensions/_prroi_pooling/". After I deleted this directory, it worked again!
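A less drastic check than deleting the whole directory is to look for leftover lock files first. The sketch below assumes that the lock is a plain file named `lock` inside the extension's build directory and that the root is either `TORCH_EXTENSIONS_DIR` or `~/.cache/torch_extensions`; both are assumptions about the setup, and the function names are hypothetical.

```python
import os
import time
from pathlib import Path


def extensions_root():
    """Guess the extensions root: TORCH_EXTENSIONS_DIR if set, else ~/.cache/torch_extensions."""
    default = Path.home() / ".cache" / "torch_extensions"
    return Path(os.environ.get("TORCH_EXTENSIONS_DIR", default))


def find_lock_files(root, min_age_seconds=0.0):
    """List files named 'lock' (assumed name) under root that are at least min_age_seconds old."""
    root = Path(root)
    if not root.is_dir():
        return []
    now = time.time()
    found = []
    for path in root.rglob("lock"):
        if not path.is_file():
            continue
        age = max(0.0, now - path.stat().st_mtime)
        if age >= min_age_seconds:
            found.append(path)
    return sorted(found)


def remove_lock_files(paths):
    """Delete the given lock files. Only do this when no build is running."""
    for path in paths:
        path.unlink()


if __name__ == "__main__":
    # Report locks older than ten minutes without deleting anything.
    for lock in find_lock_files(extensions_root(), min_age_seconds=600):
        print(lock)
```

Searching recursively also covers nested roots such as the `py38_cu113` subdirectory that appears in the log above.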

ManOfStory commented 7 months ago

> Thanks for your reply. I debugged and found the error. It did indeed occur while loading on the second frame of the sequence. The problem was that something was locked under the directory "~/.cache/torch_extensions/_prroi_pooling/". After I deleted this directory, it worked again!

I hit the same error, but deleting the directory '~/.cache/torch_extensions/_prroi_pooling/' did not help: when I restart the VOT evaluation, a new lock file appears under the _prroi_pooling directory. Have you run into this? Thank you.

log.txt `

@@TRAX:hello "trax.name=" "trax.family=" "trax.image=path;" "trax.region=mask;" "trax.description=" "trax.version=3" "vot=python" "trax.channels=color;"
@@TRAX:initialize "file://~/Projects/VOT-TOOLKIT/vot/workspace/sequences/agility/color/00000001.jpg" "mask:549,249,24,58,1,3,20,5,1,5,13,12,12,14,10,15,10,15,10,15,9,16,8,16,9,15,9,15,9,15,9,15,9,15,9,15,9,15,9,15,9,15,9,15,9,16,9,15,10,14,10,14,10,15,9,15,9,15,8,19,5,20,122,22,2,22,2,22,2,22,2,22,2,22,2,22,2,22,2,22,2,22,2,9,2,11,2,9,3,10,2,9,3,10,2,9,4,9,2,9,4,9,2,9,5,8,2,9,5,8,3,8,5,8,3,8,5,8,3,7,6,8,3,7,6,7,4,6,8,5,5,6,18,5,20,3"
@@TRAX:state "mask:549,249,24,58,1,3,20,5,1,5,13,12,12,14,10,15,10,15,10,15,9,16,8,16,9,15,9,15,9,15,9,15,9,15,9,15,9,15,9,15,9,15,9,15,9,16,9,15,10,14,10,14,10,15,9,15,9,15,8,19,5,20,122,22,2,22,2,22,2,22,2,22,2,22,2,22,2,22,2,22,2,22,2,9,2,11,2,9,3,10,2,9,3,10,2,9,4,9,2,9,4,9,2,9,5,8,2,9,5,8,3,8,5,8,3,8,5,8,3,7,6,8,3,7,6,7,4,6,8,5,5,6,18,5,20,3"
@@TRAX:frame "file://~/Projects/VOT-TOOLKIT/vot/workspace/sequences/agility/color/00000002.jpg"

Using ~/.cache/torch_extensions as PyTorch extensions root...
Creating extension directory ~/.cache/torch_extensions/_prroi_pooling...

Detected CUDA files, patching ldflags
Emitting ninja build file ~/.cache/torch_extensions/_prroi_pooling/build.ninja...
Building extension module _prroi_pooling...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
[1/3] /usr/local/cuda/bin/nvcc -DTORCH_EXTENSION_NAME=_prroi_pooling -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -isystem ~/anaconda3/envs/environment/lib/python3.7/site-packages/torch/include -isystem ~/anaconda3/envs/environment/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -isystem ~/anaconda3/envs/environment/lib/python3.7/site-packages/torch/include/TH -isystem ~/anaconda3/envs/environment/lib/python3.7/site-packages/torch/include/THC -isystem /usr/local/cuda/include -isystem ~/anaconda3/envs/environment/include/python3.7m -D_GLIBCXX_USE_CXX11_ABI=0 -DCUDA_NO_HALF_OPERATORS -DCUDA_NO_HALF_CONVERSIONS -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_86,code=sm_86 --compiler-options '-fPIC' -std=c++14 -c ~/external/AR/ltr/external/PreciseRoIPooling/pytorch/prroi_pool/src/prroi_pooling_gpu_impl.cu -o prroi_pooling_gpu_impl.cuda.o
@@TRAX:quit
[2/3] c++ -MMD -MF prroi_pooling_gpu.o.d -DTORCH_EXTENSION_NAME=_prroi_pooling -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -isystem ~/anaconda3/envs/environment/lib/python3.7/site-packages/torch/include -isystem ~/anaconda3/envs/environment/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -isystem ~/anaconda3/envs/environment/lib/python3.7/site-packages/torch/include/TH -isystem ~/anaconda3/envs/environment/lib/python3.7/site-packages/torch/include/THC -isystem /usr/local/cuda/include -isystem ~/anaconda3/envs/environment/include/python3.7m -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++14 -c ~/external/AR/ltr/external/PreciseRoIPooling/pytorch/prroi_pool/src/prroi_pooling_gpu.c -o prroi_pooling_gpu.o

Process exited with code (None) `
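In this log the extension build starts on the second frame and `@@TRAX:quit` appears between build steps [1/3] and [2/3], which suggests the tracker was stopped while the extension was still compiling. A rough way to spot this pattern in a saved log is sketched below. It assumes that a finished build is followed by a "Loading extension module ..." message, which does not appear in the log above, so treat that marker and the function name as assumptions.

```python
import re

# "Building ..." appears in the log above; "Loading ..." is the assumed
# marker that a build finished and the module was imported.
BUILDING = re.compile(r"Building extension module (\S+?)\.\.\.")
LOADING = re.compile(r"Loading extension module (\S+?)\.\.\.")


def interrupted_builds(log_text):
    """Return names of extensions whose build started but never reached the load step."""
    started = BUILDING.findall(log_text)
    loaded = set(LOADING.findall(log_text))
    # dict.fromkeys removes duplicates while keeping first-seen order.
    return [name for name in dict.fromkeys(started) if name not in loaded]


if __name__ == "__main__":
    with open("log.txt", encoding="utf-8") as handle:
        print(interrupted_builds(handle.read()))
```

If a module shows up here, it may help to let the build finish once outside the evaluation (for example by running the tracker code standalone) so that later runs find the compiled extension already in the cache.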