kenshohara / 3D-ResNets-PyTorch

3D ResNets for Action Recognition (CVPR 2018)
MIT License

RuntimeError: cuda runtime error (30) #91

Open chaofiber opened 6 years ago

chaofiber commented 6 years ago

Namespace(annotation_path='/home/nichao/data/kinetics.json', arch='resnet-34', batch_size=128, begin_epoch=1, checkpoint=5, crop_position_in_test='c', dampening=0.9, dataset='kinetics', ft_begin_index=0, initial_scale=1.0, learning_rate=0.1, lr_patience=10, manual_seed=1, mean=[114.7748, 107.7354, 99.475], mean_dataset='activitynet', model='resnet', model_depth=34, momentum=0.9, n_classes=400, n_epochs=200, n_finetune_classes=400, n_scales=5, n_threads=4, n_val_samples=3, nesterov=False, no_cuda=False, no_hflip=False, no_mean_norm=False, no_softmax_in_test=False, no_train=False, no_val=False, norm_value=1, optimizer='sgd', pretrain_path='', resnet_shortcut='B', resnext_cardinality=32, result_path='/home/nichao/data/results', resume_path='', root_path='/home/nichao/data', sample_duration=16, sample_size=112, scale_in_test=1.0, scale_step=0.84089641525, scales=[1.0, 0.84089641525, 0.7071067811803005, 0.5946035574934808, 0.4999999999911653], std=[38.7568578, 37.88248729, 40.02898126], std_norm=False, test=False, test_subset='val', train_crop='corner', video_path='/home/nichao/data/jpg', weight_decay=0.001, wide_resnet_k=2)
/home/nichao/3D-ResNets-PyTorch-master/models/resnet.py:145: UserWarning: nn.init.kaiming_normal is now deprecated in favor of nn.init.kaiming_normal_.
  m.weight = nn.init.kaiming_normal(m.weight, mode='fan_out')
THCudaCheck FAIL file=/opt/conda/conda-bld/pytorch_1532579805626/work/aten/src/THC/THCGeneral.cpp line=74 error=30 : unknown error
Traceback (most recent call last):
  File "main.py", line 47, in <module>
    model, parameters = generate_model(opt)
  File "/home/nichao/3D-ResNets-PyTorch-master/model.py", line 165, in generate_model
    model = model.cuda()
  File "/home/nichao/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 258, in cuda
    return self._apply(lambda t: t.cuda(device))
  File "/home/nichao/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 185, in _apply
    module._apply(fn)
  File "/home/nichao/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 191, in _apply
    param.data = fn(param.data)
  File "/home/nichao/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 258, in <lambda>
    return self._apply(lambda t: t.cuda(device))
RuntimeError: cuda runtime error (30) : unknown error at /opt/conda/conda-bld/pytorch_1532579805626/work/aten/src/THC/THCGeneral.cpp:74

Above is the full output of the run. I don't know what is wrong. I followed the instructions for the code. My CUDA version is 9.1 and my OS is Ubuntu 16.04. Can anyone help? Thanks in advance.

chaofiber commented 6 years ago

Luckily, I solved it. It seems that my session had been suspended for a while, and that is what caused this CUDA runtime error. I just rebooted the machine and everything works fine now!
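For anyone hitting the same traceback, here is a minimal sketch of how the failure at `model.cuda()` could be made easier to diagnose. The helper name `run_cuda_step` and the hint text are hypothetical and not part of this repository. It only wraps a callable and re-raises the "unknown error" with a pointer to the suspend/reboot explanation above; it does not repair the CUDA state.

```python
def run_cuda_step(step, *args, **kwargs):
    """Call `step` and, if it fails with the CUDA 'unknown error',
    re-raise with a hint about suspend/resume.

    `step` is any callable, e.g. `model.cuda` (hypothetical usage).
    """
    try:
        return step(*args, **kwargs)
    except RuntimeError as err:
        message = str(err)
        # Error 30 is reported by this PyTorch build as "unknown error".
        if "cuda runtime error (30)" in message or "unknown error" in message:
            hint = (
                "Hint: this error has been reported after the machine was "
                "suspended; rebooting resolved it for the reporter."
            )
            raise RuntimeError(message + "\n" + hint) from err
        # Any other RuntimeError is passed through unchanged.
        raise


# Hypothetical usage inside generate_model():
#     model = run_cuda_step(model.cuda)
```

Some users also report that reloading the `nvidia_uvm` kernel module (`sudo rmmod nvidia_uvm`, then `sudo modprobe nvidia_uvm`) avoids the reboot. That workaround is not confirmed in this thread, so treat it as an assumption.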

oscardssmith commented 5 years ago

This is pretty annoying. I just ran into it also. Is there any way we could make it so suspending didn't completely break things?

Miladiouss commented 5 years ago

This happens to me on my laptop all the time.

yuanzhoulvpi2017 commented 5 years ago

"Luckily, I solved it. It seems that my session had been suspended for a while, and that is what caused this CUDA runtime error. I just rebooted the machine and everything works fine now!"

Yes, I had the same situation. While I took a break at noon, my computer (running Ubuntu 18.10) went into suspend. I just rebooted it and the error was gone.
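Since several reports here point to suspend as the trigger, a script could check whether the machine has been suspended since boot before it touches the GPU. The sketch below assumes Linux, where `CLOCK_BOOTTIME` counts time spent suspended and `CLOCK_MONOTONIC` does not. The function name `seconds_suspended_since_boot` is hypothetical. It cannot tell whether the CUDA driver is actually in a bad state, only that a suspend has happened.

```python
import time


def seconds_suspended_since_boot():
    """Return the approximate time (in seconds) the machine has spent
    suspended since boot, or None if the platform lacks the needed clocks.
    """
    if not (hasattr(time, "CLOCK_BOOTTIME") and hasattr(time, "CLOCK_MONOTONIC")):
        return None  # e.g. not Linux, or an older Python
    monotonic = time.clock_gettime(time.CLOCK_MONOTONIC)  # excludes suspend
    boottime = time.clock_gettime(time.CLOCK_BOOTTIME)    # includes suspend
    return max(0.0, boottime - monotonic)


suspended = seconds_suspended_since_boot()
if suspended is not None and suspended > 1.0:
    print(
        "Machine was suspended for about %.0f s since boot; "
        "CUDA initialization may fail until reboot." % suspended
    )
```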