ifzhang / FairMOT

[IJCV-2021] FairMOT: On the Fairness of Detection and Re-Identification in Multi-Object Tracking
MIT License

RuntimeError: cuda runtime error #195

Open fengchengAI opened 3 years ago

fengchengAI commented 3 years ago
ss@four-master:~/cfeng/FairMOT-master/src$ python3 demo.py mot --load_model ../models/all_dla34.pth --conf_thres 0.4
Fix size testing.
training chunk_sizes: [6, 6]
The output will be saved to  /home/ss/cfeng/FairMOT-master/src/lib/../../exp/mot/default
heads {'hm': 1, 'wh': 2, 'id': 512, 'reg': 2}
2020-07-17 16:38:45 [INFO]: Starting tracking...
Lenth of the video: 1500 frames
Creating model...
loaded ../models/all_dla34.pth, epoch 10
2020-07-17 16:38:48 [INFO]: Processing frame 0 (100000.00 fps)
THCudaCheck FAIL file=/pytorch/aten/src/THC/THCCachingHostAllocator.cpp line=278 error=700 : an illegal memory access was encountered
Traceback (most recent call last):
  File "demo.py", line 40, in <module>
    demo(opt)
  File "demo.py", line 30, in demo
    eval_seq(opt, dataloader, 'mot', result_filename, save_dir=frame_dir, show_image=False, frame_rate=frame_rate)
  File "/home/ss/cfeng/FairMOT-master/src/track.py", line 62, in eval_seq
    online_targets = tracker.update(blob, img0)
  File "/home/ss/cfeng/FairMOT-master/src/lib/tracker/multitracker.py", line 241, in update
    output = self.model(im_blob)[-1]
  File "/home/ss/.virtualenvs/Python3.6-fc/lib/python3.6/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/ss/cfeng/FairMOT-master/src/lib/models/networks/pose_dla_dcn.py", line 471, in forward
    x = self.dla_up(x)
  File "/home/ss/.virtualenvs/Python3.6-fc/lib/python3.6/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/ss/cfeng/FairMOT-master/src/lib/models/networks/pose_dla_dcn.py", line 410, in forward
    ida(layers, len(layers) -i - 2, len(layers))
  File "/home/ss/.virtualenvs/Python3.6-fc/lib/python3.6/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/ss/cfeng/FairMOT-master/src/lib/models/networks/pose_dla_dcn.py", line 383, in forward
    layers[i] = upsample(project(layers[i]))
  File "/home/ss/.virtualenvs/Python3.6-fc/lib/python3.6/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/ss/cfeng/FairMOT-master/src/lib/models/networks/pose_dla_dcn.py", line 354, in forward
    x = self.conv(x)
  File "/home/ss/.virtualenvs/Python3.6-fc/lib/python3.6/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/ss/cfeng/FairMOT-master/src/lib/models/networks/DCNv2/dcn_v2.py", line 121, in forward
    offset = torch.cat((o1, o2), dim=1)
RuntimeError: cuda runtime error (700) : an illegal memory access was encountered at /pytorch/aten/src/THC/THCCachingHostAllocator.cpp:278
Segmentation fault (core dumped)

How can I resolve this?


mahxn0 commented 3 years ago

Did you solve this? I'm running into the same problem. @fengchengAI

mtmoreira98 commented 3 years ago

@mahxn0 were you able to solve this? I ran into the same problem.

jravishankar commented 3 years ago

I don't know why this worked for me, but try re-running the DCNv2 shell script (make.sh) before training (I'm training on an AWS EC2 instance).
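
For anyone who wants to script that rebuild, here is a minimal sketch. It assumes that make.sh sits in the DCNv2 directory shown in the traceback (src/lib/models/networks/DCNv2) and can be run with `sh`. The helper names `rebuild_dcnv2` and `blocking_env` are hypothetical and not part of FairMOT. The second helper sets `CUDA_LAUNCH_BLOCKING=1`. CUDA reports errors asynchronously, so the `torch.cat` line in the traceback may not be the call that actually failed; with blocking launches the traceback should point at the real one.

```python
import os
import subprocess
from pathlib import Path

# Location of the DCNv2 extension relative to the repo root, taken from the
# traceback above. That make.sh lives in this directory is an assumption.
DCNV2_SUBDIR = Path("src/lib/models/networks/DCNv2")


def rebuild_dcnv2(repo_root, dry_run=False):
    """Re-run make.sh inside the DCNv2 directory.

    Returns the build directory and the command, so the call can be
    inspected with dry_run=True before anything is executed.
    """
    build_dir = Path(repo_root) / DCNV2_SUBDIR
    cmd = ["sh", "make.sh"]  # assumption: the script runs under plain sh
    if not dry_run:
        subprocess.run(cmd, cwd=build_dir, check=True)
    return build_dir, cmd


def blocking_env(base_env=None):
    """Return a copy of the environment with synchronous CUDA launches."""
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_LAUNCH_BLOCKING"] = "1"
    return env


if __name__ == "__main__":
    # Hypothetical checkout location; adjust to your own path.
    repo = Path.home() / "FairMOT"
    rebuild_dcnv2(repo)
    # Re-run the demo from the original report with blocking launches.
    subprocess.run(
        ["python3", "demo.py", "mot",
         "--load_model", "../models/all_dla34.pth",
         "--conf_thres", "0.4"],
        cwd=repo / "src",
        env=blocking_env(),
        check=True,
    )
```

If the rebuilt extension still crashes, the blocking run at least narrows down which kernel triggers the illegal memory access.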