Open cooodeKnight opened 5 years ago
Also, in loss_function, I tried printing the loss L_t with print(L_t.data), and the log is:
THCudaCheck FAIL file=/pytorch/aten/src/THC/THCReduceAll.cuh line=327 error=59 : device-side assert triggered
Traceback (most recent call last):
File "train.py", line 262, in
There is a problem when I try to train. Here is the log:
=============> Loading args
Namespace(dataDir='./data/dataset', finetuning=False, load='human_matting', lr=0.001, lrDecay=100, lrdecayType='keep', nEpochs=100, nThreads=4, patch_size=320, saveDir='./ckpt', save_epoch=10, trainData='human_matting_data', trainList='./data/train_list.txt', train_batch=8, train_phase='pre_train_t_net', without_gpu=False)
============> Environment init
============> Building model ...
============> Loading datasets ...
Dataset : file number 1700
============> Set optimizer ...
============> Start Train ! ...
/usr/local/lib/python3.5/dist-packages/torch/nn/functional.py:2539: UserWarning: Default upsampling behavior when mode=bilinear is changed to align_corners=False since 0.4.0. Please specify align_corners=True if the old behavior is desired. See the documentation of nn.Upsample for details.
"See the documentation of nn.Upsample for details.".format(mode))
THCudaCheck FAIL file=/pytorch/aten/src/THC/generic/THCTensorMath.cu line=26 error=59 : device-side assert triggered
Traceback (most recent call last):
File "train.py", line 261, in
main()
File "train.py", line 230, in main
loss.backward()
File "/usr/local/lib/python3.5/dist-packages/torch/tensor.py", line 107, in backward
torch.autograd.backward(self, gradient, retain_graph, create_graph)
File "/usr/local/lib/python3.5/dist-packages/torch/autograd/__init__.py", line 93, in backward
allow_unreachable=True) # allow_unreachable flag
RuntimeError: cuda runtime error (59) : device-side assert triggered at /pytorch/aten/src/THC/generic/THCTensorMath.cu:26
What's wrong with it?