Open ashra-main opened 4 years ago
I figured out why there's a discrepancy between validation and training scores. The default value of the train_type parameter is trainval, so the model is actually trained on the validation set as well, and after 1000 epochs it was overfitting. I changed train_type to train so that it behaves normally (I'd suggest making this the default). However, the behavior of ContextNet and FastSCNN is still unexplained.
When I put these two models in eval() mode, the outputs become nonsense, but they look OK in train() mode. Perhaps there's something wrong with the regularization methods!
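For context on the train()/eval() difference described above: layers such as batch normalization switch behavior between the two modes. The toy, framework-free sketch below illustrates that mechanism only. Whether it explains the ContextNet and FastSCNN results is not established in this thread. `ToyBatchNorm` is a hypothetical class written for this illustration, not PyTorch's implementation (for simplicity it uses the biased variance for the running estimate as well).

```python
import math


class ToyBatchNorm:
    """Hypothetical 1-D batch norm over a list of floats (illustration only)."""

    def __init__(self, momentum=0.1, eps=1e-5):
        self.momentum = momentum
        self.eps = eps
        self.running_mean = 0.0  # initial running estimates
        self.running_var = 1.0
        self.training = True

    def train(self):
        self.training = True
        return self

    def eval(self):
        self.training = False
        return self

    def __call__(self, batch):
        if self.training:
            # Train mode: normalize with the statistics of the current batch
            n = len(batch)
            mean = sum(batch) / n
            var = sum((x - mean) ** 2 for x in batch) / n
            # ...and move the running estimates a small step toward them
            m = self.momentum
            self.running_mean = (1 - m) * self.running_mean + m * mean
            self.running_var = (1 - m) * self.running_var + m * var
        else:
            # Eval mode: normalize with the stored running estimates only
            mean, var = self.running_mean, self.running_var
        return [(x - mean) / math.sqrt(var + self.eps) for x in batch]


bn = ToyBatchNorm()
batch = [10.0, 12.0, 14.0, 16.0]

train_out = bn.train()(batch)  # centered around zero
eval_out = bn.eval()(batch)    # running stats still lag far behind the data

print("train mode:", [round(v, 3) for v in train_out])
print("eval mode: ", [round(v, 3) for v in eval_out])
```

If the running estimates do not match the data the model sees at test time, eval-mode activations can be far off scale even though train-mode outputs look reasonable. Comparing a model's outputs on the same input in both modes is one way to check whether this kind of mismatch is involved.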
Have you figured out the problem? I ran into similar issues. I used 'train' for train_type. The training IoU and loss looked good, but when I ran test.py, the results were very bad for ENet and bad for FSSNet. test.py works fine for LEDNet, though. Could there be a setting missing from the setup?
I trained 7 of the models, and all of them reached more than 80% validation mIoU with the default settings (CamVid dataset, after 1000 epochs). But when I tested the checkpoints from epoch 1000 with test.py, I got these mIoUs: CGNet_0.651, ContextNet_0.060, DABNet_0.652, EDANet_0.288, ENet_0.590, ERFNet_0.672, FastSCNN_0.011.
So I'm wondering: why is there a significant difference between validation and test scores? And why do the ContextNet and FastSCNN checkpoints appear not to be trained?
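When comparing validation and test numbers, it can help to recompute mIoU independently from predictions and labels. The sketch below uses the standard definition (per-class intersection over union from a confusion matrix, averaged over classes that occur). How the repository's training loop and test.py compute their scores is not shown in this thread, so the function names, the `ignore_label` parameter, and the toy labels here are assumptions for illustration.

```python
def confusion_matrix(y_true, y_pred, num_classes, ignore_label=None):
    """Count (true class, predicted class) pairs over flattened pixel labels."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        if t == ignore_label:
            continue  # skip pixels marked as "ignore"
        cm[t][p] += 1
    return cm


def mean_iou(cm):
    """Mean of per-class IoU = TP / (TP + FP + FN), skipping absent classes."""
    n = len(cm)
    ious = []
    for c in range(n):
        tp = cm[c][c]
        fp = sum(cm[r][c] for r in range(n)) - tp  # predicted c, truly other
        fn = sum(cm[c]) - tp                       # truly c, predicted other
        denom = tp + fp + fn
        if denom > 0:
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


# Toy flattened labels for a 3-class problem (made up for this example)
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]

cm = confusion_matrix(y_true, y_pred, num_classes=3)
print("mIoU:", mean_iou(cm))
```

Running the same function over both the validation and the test predictions rules out differences caused by the metric code itself, such as per-batch versus whole-dataset averaging or the handling of an ignore label.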
I have Python 3.7 and PyTorch 1.4. Hope we can solve the testing issues soon.