Open lianxxx opened 4 years ago
For me, a learning rate of 1e-13 leads to numerical instabilities. I don't know how the authors were able to use such a small learning rate. I trained with U-Net and DeepLab v3 on noisy medical images, using the SGD optimizer.
[Figures: U-Net DenseCRF; DeepLab DenseCRF]
Legend: ε: learning rate, λ: regularization, μ: momentum
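For reference, here is a minimal sketch of how the three quantities in the legend enter a single SGD step. It assumes the Caffe-style convention, where the learning rate is folded into the velocity; the post does not say which form was used, and torch.optim.SGD applies the learning rate slightly differently. The function name sgd_step and its arguments are hypothetical.

```python
def sgd_step(w, grad, velocity, lr, weight_decay, momentum):
    """One SGD step for a scalar weight (Caffe-style convention, assumed).

    lr           -> epsilon (learning rate)
    weight_decay -> lambda  (L2 regularization strength)
    momentum     -> mu      (momentum coefficient)
    """
    # L2 regularization adds lambda * w to the raw gradient.
    g = grad + weight_decay * w
    # The velocity accumulates past steps, scaled by mu.
    velocity = momentum * velocity - lr * g
    # The weight moves by the new velocity.
    w = w + velocity
    return w, velocity


if __name__ == "__main__":
    w, v = 1.0, 0.0
    for _ in range(3):
        w, v = sgd_step(w, grad=2.0, velocity=v, lr=0.1,
                        weight_decay=0.0, momentum=0.9)
        print(w, v)
```

The size of each step is roughly ε times the gradient magnitude, so whether a given ε is workable depends on how the loss (and therefore the gradient) is scaled.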
Dear author, I recently set up a new environment with PyTorch 1.7.1 and CUDA 11.2, and compilation fails with an error. It shows /bin/sh: 1: :/usr/local/cuda/bin/nvcc: not found, but nvcc is there. Do you have any suggestions? Thanks very much.
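One thing that may be worth checking, sketched here under an assumption: the path in the error starts with a colon (":/usr/local/cuda/bin/nvcc"), which would be consistent with the CUDA_HOME environment variable itself beginning with ":" (for example after a PATH-style export such as CUDA_HOME=$CUDA_HOME:/usr/local/cuda). PyTorch's extension builder derives the nvcc location from CUDA_HOME, so a stray colon would end up in the command. The helpers nvcc_path and diagnose below are hypothetical names for illustration, and the check assumes a POSIX system.

```python
import os
import posixpath


def nvcc_path(env):
    """Return the nvcc path implied by CUDA_HOME (or CUDA_PATH), or None."""
    cuda_home = env.get("CUDA_HOME") or env.get("CUDA_PATH")
    if not cuda_home:
        return None
    return posixpath.join(cuda_home, "bin", "nvcc")


def diagnose(env):
    """Describe the most likely problem with the CUDA_HOME setting."""
    cuda_home = env.get("CUDA_HOME") or env.get("CUDA_PATH")
    if not cuda_home:
        return "CUDA_HOME is not set"
    # On POSIX, ':' separates PATH entries; CUDA_HOME must be a single directory.
    if ":" in cuda_home:
        return "CUDA_HOME contains ':' and is not a single directory: %r" % cuda_home
    path = nvcc_path(env)
    if not os.path.isfile(path):
        return "nvcc not found at %s" % path
    return "ok: %s" % path


if __name__ == "__main__":
    # Inspect the real environment of the current shell.
    print(diagnose(os.environ))
```

If the check reports a colon, setting CUDA_HOME to the plain directory (for instance export CUDA_HOME=/usr/local/cuda) before rebuilding would be the thing to try.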
Thanks for your repo! I tried to put this CRFasRNN module into a ResNet-based segmentation network, but the loss did not decrease. Could you please give any suggestions for this situation, such as how to choose the hyper-parameters and how to initialise the CRF layers? I am also confused about why such a small learning rate, 1e-13, was chosen in the original paper. Is there a reason for it?
Any replies are much appreciated.