chenyuntc / simple-faster-rcnn-pytorch

A simplified implementation of Faster R-CNN that replicates the performance of the original paper

When I train on my own dataset, I get the error 'RuntimeError: cuda runtime error (59) : device-side assert triggered at /pytorch/torch/lib/THC/generic/THCTensorCopy.c:21' #28

Open Bigwode opened 6 years ago

Bigwode commented 6 years ago

======user config========
{'caffe_pretrain': False, 'caffe_pretrain_path': '/vgg16_caffe.pth', 'data': 'voc', 'debug_file': '/tmp/debugf', 'env': 'fasterrcnn-caffe', 'epoch': 14, 'load_path': None, 'lr': 0.001, 'lr_decay': 0.1, 'max_size': 1000, 'min_size': 600, 'num_workers': 8, 'plot_every': 100, 'port': 8097, 'pretrained_model': 'vgg16', 'roi_sigma': 1.0, 'rpn_sigma': 3.0, 'test_num': 10000, 'test_num_workers': 8, 'use_adam': False, 'use_chainer': False, 'use_drop': False, 'voc_data_dir': '/home/chenzw/tensor/Faster-rcnn-hoi/hico_20160224_det/', 'weight_decay': 0.0005}
==========end============
loading dataset
model construct completed
1it [00:01, 1.78s/it]
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [36,0,0] Assertion indexAtDim < data.baseSizes[dim] failed. /pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [37,0,0] Assertion indexAtDim < data.baseSizes[dim] failed. /pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [38,0,0] Assertion indexAtDim < data.baseSizes[dim] failed. /pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [39,0,0] Assertion indexAtDim < data.baseSizes[dim] failed. /pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [40,0,0] Assertion indexAtDim < data.baseSizes[dim] failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [41,0,0] Assertion indexAtDim < data.baseSizes[dim] failed. /pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [42,0,0] Assertion indexAtDim < data.baseSizes[dim] failed. /pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [43,0,0] Assertion indexAtDim < data.baseSizes[dim] failed. /pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [60,0,0] Assertion indexAtDim < data.baseSizes[dim] failed. /pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [61,0,0] Assertion indexAtDim < data.baseSizes[dim] failed. /pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [62,0,0] Assertion indexAtDim < data.baseSizes[dim] failed. /pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [63,0,0] Assertion indexAtDim < data.baseSizes[dim] failed. /pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [96,0,0] Assertion indexAtDim < data.baseSizes[dim] failed. 
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [97,0,0] Assertion indexAtDim < data.baseSizes[dim] failed. /pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [98,0,0] Assertion indexAtDim < data.baseSizes[dim] failed. /pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [99,0,0] Assertion indexAtDim < data.baseSizes[dim] failed. /pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [100,0,0] Assertion indexAtDim < data.baseSizes[dim] failed. /pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [101,0,0] Assertion indexAtDim < data.baseSizes[dim] failed. /pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [102,0,0] Assertion indexAtDim < data.baseSizes[dim] failed. /pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [103,0,0] Assertion indexAtDim < data.baseSizes[dim] failed. /pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [104,0,0] Assertion indexAtDim < data.baseSizes[dim] failed. 
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [105,0,0] Assertion indexAtDim < data.baseSizes[dim] failed. /pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [106,0,0] Assertion indexAtDim < data.baseSizes[dim] failed. /pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [107,0,0] Assertion indexAtDim < data.baseSizes[dim] failed. /pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [108,0,0] Assertion indexAtDim < data.baseSizes[dim] failed. /pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [109,0,0] Assertion indexAtDim < data.baseSizes[dim] failed. /pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [110,0,0] Assertion indexAtDim < data.baseSizes[dim] failed. /pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [111,0,0] Assertion indexAtDim < data.baseSizes[dim] failed. /pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [112,0,0] Assertion indexAtDim < data.baseSizes[dim] failed. 
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [113,0,0] Assertion indexAtDim < data.baseSizes[dim] failed. /pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [114,0,0] Assertion indexAtDim < data.baseSizes[dim] failed. /pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [115,0,0] Assertion indexAtDim < data.baseSizes[dim] failed. /pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [116,0,0] Assertion indexAtDim < data.baseSizes[dim] failed. /pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [117,0,0] Assertion indexAtDim < data.baseSizes[dim] failed. /pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [118,0,0] Assertion indexAtDim < data.baseSizes[dim] failed. /pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [119,0,0] Assertion indexAtDim < data.baseSizes[dim] failed. /pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [120,0,0] Assertion indexAtDim < data.baseSizes[dim] failed. 
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [121,0,0] Assertion indexAtDim < data.baseSizes[dim] failed. /pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [122,0,0] Assertion indexAtDim < data.baseSizes[dim] failed. /pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [123,0,0] Assertion indexAtDim < data.baseSizes[dim] failed. /pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [0,0,0] Assertion indexAtDim < data.baseSizes[dim] failed. /pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [1,0,0] Assertion indexAtDim < data.baseSizes[dim] failed. /pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [2,0,0] Assertion indexAtDim < data.baseSizes[dim] failed. /pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [3,0,0] Assertion indexAtDim < data.baseSizes[dim] failed. /pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [4,0,0] Assertion indexAtDim < data.baseSizes[dim] failed. 
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [5,0,0] Assertion indexAtDim < data.baseSizes[dim] failed. /pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [6,0,0] Assertion indexAtDim < data.baseSizes[dim] failed. /pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [7,0,0] Assertion indexAtDim < data.baseSizes[dim] failed. /pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [8,0,0] Assertion indexAtDim < data.baseSizes[dim] failed. /pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [9,0,0] Assertion indexAtDim < data.baseSizes[dim] failed. /pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [10,0,0] Assertion indexAtDim < data.baseSizes[dim] failed. /pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [11,0,0] Assertion indexAtDim < data.baseSizes[dim] failed. /pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [12,0,0] Assertion indexAtDim < data.baseSizes[dim] failed. 
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [13,0,0] Assertion indexAtDim < data.baseSizes[dim] failed. /pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [14,0,0] Assertion indexAtDim < data.baseSizes[dim] failed. /pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [15,0,0] Assertion indexAtDim < data.baseSizes[dim] failed. /pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [28,0,0] Assertion indexAtDim < data.baseSizes[dim] failed. /pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [29,0,0] Assertion indexAtDim < data.baseSizes[dim] failed. /pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [30,0,0] Assertion indexAtDim < data.baseSizes[dim] failed. /pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [31,0,0] Assertion indexAtDim < data.baseSizes[dim] failed. /pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [72,0,0] Assertion indexAtDim < data.baseSizes[dim] failed. 
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [73,0,0] Assertion indexAtDim < data.baseSizes[dim] failed. /pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [74,0,0] Assertion indexAtDim < data.baseSizes[dim] failed. /pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [75,0,0] Assertion indexAtDim < data.baseSizes[dim] failed. /pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [76,0,0] Assertion indexAtDim < data.baseSizes[dim] failed. /pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [77,0,0] Assertion indexAtDim < data.baseSizes[dim] failed. /pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [78,0,0] Assertion indexAtDim < data.baseSizes[dim] failed. /pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [79,0,0] Assertion indexAtDim < data.baseSizes[dim] failed. /pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [80,0,0] Assertion indexAtDim < data.baseSizes[dim] failed. 
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [81,0,0] Assertion indexAtDim < data.baseSizes[dim] failed. /pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [82,0,0] Assertion indexAtDim < data.baseSizes[dim] failed. /pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [83,0,0] Assertion indexAtDim < data.baseSizes[dim] failed. /pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [84,0,0] Assertion indexAtDim < data.baseSizes[dim] failed. /pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [85,0,0] Assertion indexAtDim < data.baseSizes[dim] failed. /pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [86,0,0] Assertion indexAtDim < data.baseSizes[dim] failed. /pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [87,0,0] Assertion indexAtDim < data.baseSizes[dim] failed. /pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [92,0,0] Assertion indexAtDim < data.baseSizes[dim] failed. 
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [93,0,0] Assertion indexAtDim < data.baseSizes[dim] failed. /pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [94,0,0] Assertion indexAtDim < data.baseSizes[dim] failed. /pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [95,0,0] Assertion indexAtDim < data.baseSizes[dim] failed.
THCudaCheck FAIL file=/pytorch/torch/lib/THC/generic/THCTensorCopy.c line=21 error=59 : device-side assert triggered

Traceback (most recent call last):
  File "train3.py", line 137, in <module>
    fire.Fire()
  File "/home/chenzw/anaconda3/envs/tensor/lib/python3.5/site-packages/fire/core.py", line 127, in Fire
    component_trace = _Fire(component, args, context, name)
  File "/home/chenzw/anaconda3/envs/tensor/lib/python3.5/site-packages/fire/core.py", line 366, in _Fire
    component, remaining_args)
  File "/home/chenzw/anaconda3/envs/tensor/lib/python3.5/site-packages/fire/core.py", line 542, in _CallCallable
    result = fn(*varargs, **kwargs)
  File "train3.py", line 84, in train
    trainer.train_step(img, bbox, label, scale)
  File "/home/chenzw/tensor/Faster-rcnn-hoi/trainer.py", line 168, in train_step
    losses = self.forward(imgs, bboxes, labels, scale)
  File "/home/chenzw/tensor/Faster-rcnn-hoi/trainer.py", line 148, in forward
    gt_roi_label = at.tovariable(gt_roi_label).long()
  File "/home/chenzw/tensor/Faster-rcnn-hoi/utils/array_tool.py", line 31, in tovariable
    return tovariable(totensor(data))
  File "/home/chenzw/tensor/Faster-rcnn-hoi/utils/array_tool.py", line 25, in totensor
    tensor = tensor.cuda()
  File "/home/chenzw/anaconda3/envs/tensor/lib/python3.5/site-packages/torch/_utils.py", line 69, in _cuda
    return newtype(self.size()).copy(self, async)
RuntimeError: cuda runtime error (59) : device-side assert triggered at /pytorch/torch/lib/THC/generic/THCTensorCopy.c:21

If you suspect this is an IPython bug, please report it at: https://github.com/ipython/ipython/issues or send an email to the mailing list at ipython-dev@python.org

You can print a more detailed traceback right now with "%tb", or use "%debug" to interactively debug it.

Extra-detailed tracebacks for bug-reporting purposes can be enabled via: %config Application.verbose_crash=True

chenyuntc commented 6 years ago

export CUDA_LAUNCH_BLOCKING=1
Then try running the program again and see what happens?
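
The same variable can also be set from inside the training script. This is a minimal sketch, assuming the variable has to be in the environment before CUDA is initialized, so it is set before torch is imported; `enable_blocking_launches` is a hypothetical helper name, not part of this repository.

```python
import os

def enable_blocking_launches(env=os.environ):
    """Ask CUDA to run kernels synchronously so an error is reported at the failing call.

    Hypothetical helper; call it before the first CUDA call is made.
    """
    # Shell equivalent: export CUDA_LAUNCH_BLOCKING=1 (no spaces around '=')
    env["CUDA_LAUNCH_BLOCKING"] = "1"
    return env["CUDA_LAUNCH_BLOCKING"]

enable_blocking_launches()
# import torch  # import and use torch only after the variable is set
```

With synchronous launches the Python traceback should point at the operation that actually failed, rather than at a later, unrelated `.cuda()` copy.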

Bigwode commented 6 years ago

@chenyuntc I had already searched online for this problem, but after running with that setting I still get an error, although a different one:

======user config========
{'caffe_pretrain': True, 'caffe_pretrain_path': '/home/chenzw/tensor/SIM-Faster-rcnn-pytorch/vgg16_caffe.pth', 'data': 'voc', 'debug_file': '/tmp/debugf', 'env': 'fasterrcnn-caffe', 'epoch': 14, 'load_path': None, 'lr': 0.001, 'lr_decay': 0.1, 'max_size': 1000, 'min_size': 600, 'num_workers': 8, 'plot_every': 100, 'port': 8097, 'pretrained_model': 'vgg16', 'roi_sigma': 1.0, 'rpn_sigma': 3.0, 'test_num': 10000, 'test_num_workers': 8, 'use_adam': False, 'use_chainer': False, 'use_drop': False, 'voc_data_dir': '/home/chenzw/tensor/SIM-Faster-rcnn-pytorch/hico_20160224_det/', 'weight_decay': 0.0005}
==========end============
loading dataset
model construct completed
THCudaCheck FAIL file=/pytorch/torch/lib/THC/generic/THCStorage.cu line=58 error=2 : out of memory
Traceback (most recent call last):
  File "train3.py", line 136, in <module>
    fire.Fire()  # auto-generates a command-line tool; when there are several functions, the function name must be given
  File "/home/chenzw/anaconda3/envs/tensor/lib/python3.5/site-packages/fire/core.py", line 127, in Fire
    component_trace = _Fire(component, args, context, name)
  File "/home/chenzw/anaconda3/envs/tensor/lib/python3.5/site-packages/fire/core.py", line 366, in _Fire
    component, remaining_args)
  File "/home/chenzw/anaconda3/envs/tensor/lib/python3.5/site-packages/fire/core.py", line 542, in _CallCallable
    result = fn(*varargs, **kwargs)
  File "train3.py", line 69, in train
    trainer = FasterRCNNTrainer(faster_rcnn).cuda()
  File "/home/chenzw/anaconda3/envs/tensor/lib/python3.5/site-packages/torch/nn/modules/module.py", line 216, in cuda
    return self._apply(lambda t: t.cuda(device))
  File "/home/chenzw/anaconda3/envs/tensor/lib/python3.5/site-packages/torch/nn/modules/module.py", line 146, in _apply
    module._apply(fn)
  File "/home/chenzw/anaconda3/envs/tensor/lib/python3.5/site-packages/torch/nn/modules/module.py", line 146, in _apply
    module._apply(fn)
  File "/home/chenzw/anaconda3/envs/tensor/lib/python3.5/site-packages/torch/nn/modules/module.py", line 146, in _apply
    module._apply(fn)
  File "/home/chenzw/anaconda3/envs/tensor/lib/python3.5/site-packages/torch/nn/modules/module.py", line 146, in _apply
    module._apply(fn)
  File "/home/chenzw/anaconda3/envs/tensor/lib/python3.5/site-packages/torch/nn/modules/module.py", line 152, in _apply
    param.data = fn(param.data)
  File "/home/chenzw/anaconda3/envs/tensor/lib/python3.5/site-packages/torch/nn/modules/module.py", line 216, in <lambda>
    return self._apply(lambda t: t.cuda(device))
  File "/home/chenzw/anaconda3/envs/tensor/lib/python3.5/site-packages/torch/_utils.py", line 69, in _cuda
    return newtype(self.size()).copy(self, async)
RuntimeError: cuda runtime error (2) : out of memory at /pytorch/torch/lib/THC/generic/THCStorage.cu:58

chenyuntc commented 6 years ago

It looks like a GPU memory leak:

RuntimeError: cuda runtime error (2) : out of memory at /pytorch/torch/lib/THC/generic/THCStorage.cu:58

Bigwode commented 6 years ago

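@chenyuntc I have two Titan Xp cards ..... I guess a beginner modifying someone else's code really does run into all kinds of strange problems. Now export CUDA_LAUNCH_BLOCKING=1 does not help either; it reports the same error as at the very beginning.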

penguinshin commented 6 years ago

I am also having this issue.

kingxueyuf commented 6 years ago

Me too. Does anyone have a solution for this?

penguinshin commented 6 years ago

It might have something to do with the hard-coded n_fg_class, which is set to 20. If you have a different number of classes, you need to change that number or abstract it out as an input.
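
A quick way to test this hypothesis is to check the dataset's labels against the class count before training. This is a minimal sketch, assuming foreground labels are 0-based integers in the range [0, n_fg_class); `find_bad_labels` and the sample label lists are hypothetical.

```python
def find_bad_labels(labels, n_fg_class):
    """Return every label that falls outside [0, n_fg_class).

    Assumption: foreground labels are 0-based and n_fg_class is the number
    of foreground classes the model head was built for.
    """
    return [int(label) for label in labels if not 0 <= int(label) < n_fg_class]

voc_like = [0, 5, 19]    # fits a 20-class model
custom = [0, 25, 116]    # hypothetical labels from a dataset with more classes

print(find_bad_labels(voc_like, 20))  # []
print(find_bad_labels(custom, 20))    # [25, 116]
```

Any label reported here would index past the end of a tensor sized for 20 classes, which is consistent with the `indexAtDim < data.baseSizes[dim]` assertion in the log above.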

kingxueyuf commented 6 years ago

Problem solved: it was caused by my local CUDA or cuDNN being installed improperly. For people who run into the same issue in the future, always do the following:

1) Check that your local CUDA and cuDNN are installed properly by following https://docs.nvidia.com/deeplearning/sdk/cudnn-install/#verify (the "2.4. Verifying" part ONLY).

2) Run the following Python code:

import torch
# Index 1 is the second GPU; use index 0 on a single-GPU machine.
print(torch.cuda.get_device_name(1))

cmstudyscode commented 6 years ago

Do you know how to modify the number of classes? I have modified it in faster_rcnn_vgg16.py, but now I get this error: AssertionError: number of predictions does not match size of confusion matrix. I think I need to modify the number of classes in the output, but I haven't found where to do that. Do you know?

cmstudyscode commented 6 years ago

@kingxueyuf Do you know how to modify the number of classes? I have modified it in faster_rcnn_vgg16.py, but now I get this error: AssertionError: number of predictions does not match size of confusion matrix. I think I need to modify the number of classes in the output, but I haven't found where to do that. Do you know?
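
The assertion text suggests that a confusion meter was created for a different number of classes than the RoI head now outputs. Below is a minimal sketch of that size check, assuming the meter must be built with n_fg_class + 1 classes (foreground plus background). `TinyConfusionMeter` is a hypothetical stand-in written for illustration, and where the real meter is constructed in this repository is not shown in the thread.

```python
class TinyConfusionMeter:
    """Hypothetical stand-in for a k-class confusion meter (not the library class)."""

    def __init__(self, k):
        self.k = k
        self.conf = [[0] * k for _ in range(k)]

    def add(self, scores, targets):
        # scores: one list of k class scores per RoI; targets: true class indices
        for row, target in zip(scores, targets):
            assert len(row) == self.k, \
                "number of predictions does not match size of confusion matrix"
            predicted = max(range(self.k), key=row.__getitem__)
            self.conf[target][predicted] += 1

n_fg_class = 3               # hypothetical custom dataset
n_class = n_fg_class + 1     # +1 for the background class
meter = TinyConfusionMeter(n_class)
meter.add([[0.1, 0.7, 0.1, 0.1]], [1])
print(meter.conf[1][1])  # 1
```

If the meter were still sized for 21 classes while the head produced 4 scores per RoI, `add` would raise the same AssertionError, so the meter size and the class count have to be changed together.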