MJITG closed this issue 4 years ago
Hi @MJITG, what results do you get after retraining the derived model? Getting a lower result in the search phase seems normal. The authors themselves only get around 0.35 mIoU in the original paper.
What do you think about the lower results in the search phase? Is that normal? @HankKung
In my experience, accuracy during the search doesn't fully represent the performance of the derived model. Whether it is normal depends on the derived architecture: if the architecture is good, then it's fine; otherwise, it wasn't a proper search. @MJITG Can you show the architecture you found? If it contains some pooling operations, that likely means the search model simply hasn't converged, which would cause the slightly low accuracy. In that case, try a larger learning rate or more epochs.
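For reference, here is a minimal sketch of the check described above, assuming a DARTS-style search where each edge has one row of alpha logits and the strongest non-`none` operation is kept. The operation names, the helper functions, and the alpha values are hypothetical and not taken from this repository.

```python
import math
from collections import Counter

# Hypothetical candidate operations (assumption: a DARTS-style search space).
PRIMITIVES = ["none", "max_pool_3x3", "avg_pool_3x3", "skip_connect",
              "sep_conv_3x3", "sep_conv_5x5", "dil_conv_3x3", "dil_conv_5x5"]

def softmax(row):
    """Numerically stable softmax over one row of alpha logits."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def derive_ops(alphas):
    """Pick the strongest operation on every edge, ignoring 'none'."""
    chosen = []
    for row in alphas:
        probs = softmax(row)
        candidates = [i for i, name in enumerate(PRIMITIVES) if name != "none"]
        best = max(candidates, key=lambda i: probs[i])
        chosen.append(PRIMITIVES[best])
    return chosen

def op_summary(chosen):
    """Count pooling and skip-connection edges in the derived cell."""
    counts = Counter(chosen)
    pooling = sum(n for name, n in counts.items() if "pool" in name)
    return {"pooling": pooling,
            "skip": counts.get("skip_connect", 0),
            "total": len(chosen)}

# Made-up alpha logits for four edges, for illustration only.
alphas = [
    [0.1, 0.9, 0.2, 0.3, 0.4, 0.1, 0.0, 0.2],
    [0.0, 0.1, 0.2, 1.2, 0.3, 0.1, 0.2, 0.1],
    [0.2, 0.1, 0.1, 0.2, 1.5, 0.3, 0.1, 0.0],
    [2.0, 0.1, 0.8, 0.2, 0.3, 0.1, 0.2, 0.1],
]
chosen = derive_ops(alphas)
summary = op_summary(chosen)
print(chosen)
print(summary)
```

If many edges in the summary turn out to be pooling, that would match the "not converged yet" case above; how many counts as too many is a judgment call that this sketch does not settle.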
Also, the accuracy will improve when the alpha value of the skip connection increases, because that boosts the gradient. So if you get good search accuracy but derive an unexpected cell architecture (e.g., skip connections), I'd suggest slightly increasing the L2 penalty (weight decay) to get less sharp inner and outer losses.
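To make the weight decay suggestion concrete, here is a small sketch of what the L2 penalty does in a plain SGD update, w <- w - lr * (g + weight_decay * w). The function name and the numbers are made up for illustration. This is not the optimizer code from this repository, and the thread doesn't say which values to use.

```python
def sgd_step(weights, grads, lr, weight_decay):
    """One plain SGD step with an L2 penalty folded into the gradient."""
    return [w - lr * (g + weight_decay * w) for w, g in zip(weights, grads)]

# Illustrative values only: zero gradients isolate the effect of the penalty.
weights = [1.0, -2.0]
grads = [0.0, 0.0]

no_decay = sgd_step(weights, grads, lr=0.1, weight_decay=0.0)
with_decay = sgd_step(weights, grads, lr=0.1, weight_decay=0.5)
print(no_decay)
print(with_decay)
```

With the penalty switched on, every parameter is pulled toward zero on each step even when the data gradient is zero, and this is the smoothing effect referred to above. In a PyTorch training script this is usually set through the `weight_decay` argument of the optimizer.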
Thanks for your great suggestions! I'll try them later!
Thanks for your great work on the code. However, I only got 0.3131 mIoU at the end of training and searching. Could you please give me some advice?