Hello @iris0329
No, the pretrained model is not trained with uncertainty. The uncertainty is applied after training, following *A General Framework for Uncertainty Estimation in Deep Learning*.
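For context, that framework combines propagating the sensor noise through the network with Monte-Carlo dropout for the model uncertainty. As a rough sketch of the test-time sampling part only (assuming a generic PyTorch model that contains dropout layers; this is an illustration, not the exact pipeline used here):

```python
import torch

@torch.no_grad()
def mc_dropout_predict(model, x, n_samples=20):
    # Monte-Carlo dropout at test time: keep the dropout layers stochastic,
    # average several forward passes, and use the variance across the
    # samples as a per-pixel model-uncertainty estimate.
    model.eval()
    for m in model.modules():
        if isinstance(m, (torch.nn.Dropout, torch.nn.Dropout2d)):
            m.train()  # re-enable dropout only, keep batch norm in eval mode
    probs = torch.stack([model(x).softmax(dim=1) for _ in range(n_samples)])
    return probs.mean(dim=0), probs.var(dim=0)  # prediction, uncertainty
```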
Could you give me more information about your training? No, I never encountered this type of issue.
Thanks for your prompt reply!
So that means that if I apply uncertainty to this pretrained model, I can get the 59.5 point-wise mean-IoU reported in Table I of the paper?
The batch size is set to 32 and I used 8 TITAN X (Pascal) GPUs. All the other settings are the defaults. It is very confusing.
I retrained it, and there is a visible tendency toward overfitting around epoch 55. Thanks.
Yes, it was evaluated on the test set.
Does that tendency continue if you train beyond 55 epochs?
Best,
Ok I will check this as soon as possible!
Hello @iris0329! Sorry for the delay on this question. I had to go back to my logs to check how the submission training (and the other experiments) went, to verify our behaviour.
I misled you earlier: we did have a similar issue with apparent overfitting. Our baseline comparison (RangeNet++) also shows this apparent overfitting when we trained it from scratch.
Again, sorry for the delay and for the misleading answers before; I didn't have a clear picture of our val_loss plots in mind!
Sorry, I thought about this problem before but forgot to post an update here.
I think it is a problem caused by the cross-entropy loss.
Think about the matrix Matrix_A before the argmax function; its size is (h, w, class_num). After applying argmax along the class dimension of Matrix_A, we get a new Matrix_B of size (h, w, 1).
The increasing IoU means that Matrix_B becomes more accurate. However, the cross-entropy loss is computed from Matrix_A, not from Matrix_B, so the two metrics can move in opposite directions.
For example, suppose there are two pixels and the ground truth is [2, 0].
At the beginning, Matrix_A is [[0.005, 0.005, 0.99], [0.36, 0.38, 0.26]]. After applying the argmax function we get the prediction [2, 1], a 50% accuracy rate, and the mean cross-entropy is (-ln 0.99 - ln 0.36) / 2 ≈ 0.52.
As training continues, Matrix_A becomes [[0.3, 0.3, 0.4], [0.4, 0.3, 0.3]]. After argmax, the prediction is [2, 0], so the accuracy rises to 100%, but both predictions are now only barely correct, and the mean cross-entropy rises to (-ln 0.4 - ln 0.4) / 2 ≈ 0.92. The loss increases even though the accuracy is higher.
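A quick numeric check of this example (a minimal NumPy sketch):

```python
import numpy as np

labels = np.array([2, 0])                   # ground truth for the two pixels

early = np.array([[0.005, 0.005, 0.99],     # confidently correct
                  [0.36,  0.38,  0.26]])    # argmax wrong, true class still 0.36
late  = np.array([[0.3, 0.3, 0.4],          # correct, but barely
                  [0.4, 0.3, 0.3]])         # correct, but barely

for name, probs in [("early", early), ("late", late)]:
    ce  = -np.log(probs[np.arange(len(labels)), labels]).mean()
    acc = (probs.argmax(axis=1) == labels).mean()
    print(f"{name}: cross-entropy={ce:.2f}, accuracy={acc:.0%}")
# early: cross-entropy=0.52, accuracy=50%
# late:  cross-entropy=0.92, accuracy=100%
```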
So it seems that, as training progressed, the predictions tended to average out across all categories, which raises the cross-entropy loss even while the IoU improves. But I still don't know how to solve it. If anyone has a suggestion, I would be very grateful.
Hi, thanks for generously open-sourcing this brilliant project!
I have two questions about the project:
1. Does the pretrained model use uncertainty during training? Is this pretrained model the one that can reproduce the 59.5 point-wise mean-IoU in Table I of the paper?
2. When I train the model myself, I see a serious overfitting problem. The pictures below are my training-loss and validation-loss curves. Why does this problem occur? Did you also encounter it during training?
I am looking forward to your reply!