no checkpoint found at 'logs/voc2007/model_best.pth.tar'

yeezhu / SPN.pytorch

PyTorch implementation of "Soft Proposal Networks for Weakly Supervised Object Localization", ICCV 2017.

http://yzhu.work/spn.html

MIT License

no checkpoint found at 'logs/voc2007/model_best.pth.tar' #7

Closed zhihuilics closed 6 years ago

zhihuilics commented 6 years ago

I tried to run the demo, but it gets stuck at:

=> no checkpoint found at 'logs/voc2007/model_best.pth.tar'

Could you please give me advice on how to proceed?

yeezhu commented 6 years ago

@zhihuilics "no checkpoint found" means that no saved checkpoint was loaded, so you are training a new model from scratch. It seems that your dataloader gets stuck at the beginning of the first epoch. Please check the size of your shared memory segment (df -h | grep shm).
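
For reference, the message is the kind of output a resume-or-start-fresh check produces. The following is a minimal sketch of that pattern, not the repository's actual code: the function name `resume_or_start_fresh` and the `load_fn` parameter are hypothetical, and in a real PyTorch script `load_fn` would typically be `torch.load`.

```python
import os


def resume_or_start_fresh(checkpoint_path, load_fn):
    """Load a checkpoint if the file exists, otherwise start from scratch.

    `load_fn` is a hypothetical loader supplied by the caller
    (for example torch.load). Returns (state, resumed).
    """
    if os.path.isfile(checkpoint_path):
        print("=> loading checkpoint '{}'".format(checkpoint_path))
        return load_fn(checkpoint_path), True
    # A missing file is not an error: report it and train a new model.
    print("=> no checkpoint found at '{}'".format(checkpoint_path))
    return None, False


if __name__ == "__main__":
    # Stand-in loader so the sketch runs without PyTorch installed.
    state, resumed = resume_or_start_fresh(
        "logs/voc2007/model_best.pth.tar", lambda path: {}
    )
```

Under this assumption the message is informational only, so a hang that follows it points at a later stage such as data loading.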

zhiweichen0012 commented 6 years ago

@yeezhu I have the same problem. Details below:

=> no checkpoint found at 'logs/voc2007/model_best.pth.tar'
Training: 0%| | 0/79 [00:00<?, ?it/s]

~$ df -h | grep shm
tmpfs           2.0G   42M  1.9G   3% /dev/shm

What is the right size for the shared memory segment? In addition, it also gets stuck at:

=> RuntimeError: CUDNN_STATUS_ALLOC_FAILED

Could you please give me advice on how to proceed? Thanks.
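
The same check can be done from Python. This is a minimal sketch using the standard library; the function name `shared_memory_report` is hypothetical. It only reports usage: there is no single correct size, since the amount needed depends on the batch size and the number of dataloader worker processes.

```python
import shutil


def shared_memory_report(path="/dev/shm"):
    """Report total, used, and free space for the filesystem at `path`.

    The default path is the usual Linux shared memory mount; pass another
    path if the system mounts it elsewhere.
    """
    usage = shutil.disk_usage(path)
    gib = 1024 ** 3
    used_pct = 100.0 * usage.used / usage.total if usage.total else 0.0
    return {
        "total_gib": usage.total / gib,
        "used_gib": usage.used / gib,
        "free_gib": usage.free / gib,
        "used_pct": used_pct,
    }


if __name__ == "__main__":
    report = shared_memory_report()
    print("shm total: {total_gib:.1f} GiB, free: {free_gib:.1f} GiB, "
          "used: {used_pct:.0f}%".format(**report))
```

In the df output above, 1.9G of the 2.0G segment was free at the time of the check.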

zhiweichen0012 commented 6 years ago

@yeezhu I have solved the problem, thank you all the same. @zhihuilics did you solve it?

yeezhu commented 6 years ago

@zhiweichen0012 Hello, I'm glad that you've solved your problem :P

wlj567 commented 2 years ago

@zhiweichen0012 Hello, how did you solve it? Can you tell me?