Closed · Randylcy closed this issue 5 years ago
I'm using batch size 4 and crop size 224 on the same device as yours. Did you try the spec from the paper (batch size 2 and crop size 320)?
Why don't we chat on WeChat? 18463102232
I run out of memory when I try to enlarge the input size, maybe because of something I changed in the original code. I'd better try your code.
I assume it's because you set num_channel to 40! Yes, it is set to 40 in the paper's experiments, but they must have done some optimization in their code.
Yes, you're right, I verified it. And I found something very interesting: Google used a single P100 GPU, which has only 16 GB, and searched for 3 days. We use a V100 with 32 GB and still can't run the same input size and channel numbers as they did, which is strange. Maybe it comes down to the code. We have to optimize the code!
After splitting the training dataset into a weight set and an architecture set, the training time should be normal (3~4 days at batch size 2); I've merged the new code into master.
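For reference, the split can look something like the sketch below (the 50/50 ratio, the dummy dataset, and the loader names are my assumptions, not necessarily what the merged code does):

```python
import torch
from torch.utils.data import TensorDataset, random_split, DataLoader

# Dummy stand-in for the real segmentation dataset (images + label maps),
# just to keep the sketch self-contained.
images = torch.randn(100, 3, 224, 224)
masks = torch.randint(0, 19, (100, 224, 224))
full_train_set = TensorDataset(images, masks)

# Split the training data into two halves: one half updates the network
# weights, the other updates the architecture parameters (DARTS-style
# bi-level search). The 50/50 ratio is an assumption.
n_total = len(full_train_set)
n_weight = n_total // 2
weight_set, arch_set = random_split(full_train_set, [n_weight, n_total - n_weight])

weight_loader = DataLoader(weight_set, batch_size=2, shuffle=True)
arch_loader = DataLoader(arch_set, batch_size=2, shuffle=True)

# Each search step then takes one batch from weight_loader for the weight
# update, followed by one batch from arch_loader for the architecture update.
```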
Still, the high GPU memory usage is a problem. I've asked the paper's author about this; it seems he didn't do any special optimization of the code, and he suggested enabling the cuDNN benchmark mode, which I tried, but it didn't help much.
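For anyone who wants to try it anyway, it's a one-liner in PyTorch; note that it targets speed rather than memory, which is probably why it made little difference here:

```python
import torch

# Let cuDNN pick the fastest convolution algorithms for the observed input
# shapes. The first iterations get slower while it benchmarks, later ones get
# faster; it does not reduce memory use, so it won't fix OOM by itself.
torch.backends.cudnn.benchmark = True
```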
Same problem: I can only use batch size 1 and crop size 224 on a Tesla V100 32 GB, which makes no sense given the paper claims the search finishes in 3 P100-days. I think the bottleneck always lies in the network structure determined by the channel numbers and the crop size; there is very little room to optimize because there are too many connections and operations in the net.
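A rough back-of-envelope makes the scaling concrete: activation memory grows linearly with batch size and channel count, and quadratically with crop size, so crop 320 costs about twice as much as crop 224 at the same batch size. A sketch of that arithmetic (single fp32 feature map at full crop resolution, ignoring the many intermediate tensors the search network actually keeps):

```python
def activation_mb(batch, channels, crop, bytes_per_elem=4):
    """Size of one fp32 feature map at full crop resolution, in MB."""
    return batch * channels * crop * crop * bytes_per_elem / 1024 ** 2

# Settings mentioned in this thread (channels=40 as in the paper).
print(activation_mb(batch=2, channels=40, crop=320))  # ~31 MB per map (paper spec)
print(activation_mb(batch=4, channels=40, crop=224))  # ~31 MB per map
print(activation_mb(batch=1, channels=40, crop=224))  # ~8 MB per map

# The search network keeps hundreds of such maps alive (every candidate
# operation on every edge of every cell), so the totals add up very quickly.
```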
In the original code the crop size is set to 224, but I get an out-of-memory error on my GPU, a Tesla V100 32 GB. Has anyone else hit this problem? For now I set the crop size to 128 and it works. But for semantic segmentation, shouldn't we avoid cropping the images too small? Looking forward to good answers.
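In case it helps reproduce the workaround, a joint random crop for the image and its mask (so the two stay aligned) can be written like the sketch below; the crop size of 128 is just the value that happened to fit here, and the function name is mine, not the repo's:

```python
import torch
import torchvision.transforms.functional as TF
from torchvision.transforms import RandomCrop

def joint_random_crop(image, mask, crop_size=128):
    """Crop image and mask with the same window so the labels stay aligned."""
    i, j, h, w = RandomCrop.get_params(image, output_size=(crop_size, crop_size))
    return TF.crop(image, i, j, h, w), TF.crop(mask, i, j, h, w)

# Dummy tensors standing in for a Cityscapes-sized image and label map.
image = torch.randn(3, 512, 1024)
mask = torch.randint(0, 19, (1, 512, 1024))
img_crop, mask_crop = joint_random_crop(image, mask)
print(img_crop.shape, mask_crop.shape)  # (3, 128, 128) and (1, 128, 128)
```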