wuhuikai / FastFCN

FastFCN: Rethinking Dilated Convolution in the Backbone for Semantic Segmentation.
http://wuhuikai.me/FastFCNProject

Training does not proceed #35

Closed hhhwww123 closed 5 years ago

hhhwww123 commented 5 years ago

Thanks a lot for helping me solve the 'run slave' issue. However, a new problem has appeared:

```
/home/weizhaoxiang/anaconda3/envs/pytorch1.0/lib/python3.6/site-packages/torch/nn/_reduction.py:43: UserWarning: size_average and reduce args will be deprecated, please use reduction='mean' instead.
  warnings.warn(warning.format(ret))
Using poly LR Scheduler!
Starting Epoch: 0
Total Epoches: 240
  0%|          | 0/371 [00:00<?, ?it/s]
=>Epoches 0, learning rate = 0.0050, previous best = 0.0000
/home/weizhaoxiang/anaconda3/envs/pytorch1.0/lib/python3.6/site-packages/torch/nn/functional.py:2390: UserWarning: nn.functional.upsample is deprecated. Use nn.functional.interpolate instead.
  warnings.warn("nn.functional.upsample is deprecated. Use nn.functional.interpolate instead.")
```

After printing these messages, the code neither continues training nor crashes; it just hangs here. Also, no running process shows up under the command 'nvidia-smi'.

How can I solve this problem?
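
For reference, the two UserWarnings in the log are deprecation notices, not errors, and do not stop execution. A minimal sketch of what the suggested replacements look like, using illustrative tensors rather than the repository's actual code:

```python
import torch
import torch.nn.functional as F

x = torch.randn(1, 3, 60, 60)  # illustrative feature map

# Deprecated: F.upsample(x, scale_factor=8, mode='bilinear')
y = F.interpolate(x, scale_factor=8, mode='bilinear', align_corners=True)

# Deprecated: nn.CrossEntropyLoss(size_average=True, reduce=True)
criterion = torch.nn.CrossEntropyLoss(reduction='mean')
```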

wuhuikai commented 5 years ago
  1. Try running it again and wait a few minutes.
  2. Press Ctrl + C and then report the error message.
hhhwww123 commented 5 years ago

Thanks, it works now! But its process still can't be seen with the command 'nvidia-smi'.

wuhuikai commented 5 years ago

I can't figure it out :( It works well on my machine.

wuhuikai commented 4 years ago

@hhhwww123 The code in the 'latest' branch is now ready; it uses the official SyncBatchNorm and can run on any OS with PyTorch >= 1.1.0.
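
Switching to the official SyncBatchNorm is a one-line conversion in PyTorch >= 1.1.0. A minimal sketch with a toy model (the real repository builds its own FastFCN modules; the conversion only takes effect once the model is wrapped in DistributedDataParallel):

```python
import torch.nn as nn

# Toy model standing in for the actual segmentation network.
model = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1),
    nn.BatchNorm2d(16),
    nn.ReLU(inplace=True),
)

# Replace every BatchNorm layer with the official SyncBatchNorm,
# which synchronizes batch statistics across GPUs during training.
model = nn.SyncBatchNorm.convert_sync_batchnorm(model)
```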