LARC-CMU-SMU / FoodSeg103-Benchmark-v1

MM'21 Main-Track paper
Apache License 2.0

Overfitting and underfitting #10

Closed Mark1Dong closed 3 years ago

Mark1Dong commented 3 years ago

Hi, sorry for disturbing you again. I recently ran into a problem: when I trained the SegFormer model on this dataset, the performance was very poor, about 20 points lower than Swin. I suspect the model is overfitting or underfitting, i.e., that the model's width causes the low performance. What is your opinion on this? Second, regarding the fitting problem, we could compare the training loss and the evaluation loss to verify my guess; after training, my log file saves some loss values, but I don't know what kind of loss they are. Third, what are the purpose and function of the auxiliary_decode head used in the MLA_SETR model?

Looking forward to receiving your reply!
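The train-vs-eval loss comparison mentioned above can be sketched with a small helper. This is a minimal illustration with made-up loss values (the thresholds `gap_ratio` and `high_loss` are arbitrary assumptions, not from any framework):

```python
# Hedged sketch: reading training vs. validation loss curves to tell
# overfitting from underfitting. All numbers here are hypothetical.

def diagnose(train_loss, val_loss, gap_ratio=1.5, high_loss=1.0):
    """Classify the final epoch's losses.

    Overfitting: training loss keeps dropping while validation loss
    stays much higher (or rises). Underfitting: both losses stay high.
    """
    t, v = train_loss[-1], val_loss[-1]
    if t > high_loss and v > high_loss:
        return "underfitting"
    if v > gap_ratio * t:
        return "overfitting"
    return "good fit"

# Hypothetical curves: training loss falls, validation loss plateaus high.
train = [2.1, 1.2, 0.6, 0.3, 0.15]
val = [2.0, 1.4, 1.1, 1.0, 1.05]
print(diagnose(train, val))  # -> overfitting
```

In practice you would feed in the per-epoch losses parsed from the training log rather than hand-written lists.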

XiongweiWu commented 3 years ago

@Mark1Dong Sorry, I was busy with a deadline until yesterday. I have also found it common that SOTA segmentation methods fail to achieve the same improvement on food image segmentation as on other datasets; e.g., the Swin model cannot beat SETR, and a heavy ViT backbone may hurt performance. I believe two reasons may account for this: (i) the scale of FoodSeg103 is still limited; (ii) the challenges of food image segmentation (large variance and a long-tail distribution) require specific module designs to handle.

As for the auxiliary decoder in MLA_SETR (it is also used in other frameworks), we just follow the default settings. My personal understanding is that it provides loss signals based on different scales.
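The role of the auxiliary decoder described above can be sketched as deep supervision: an extra head attached to an intermediate backbone stage adds its own (down-weighted) loss during training and is dropped at inference. This is a minimal sketch assuming an mmsegmentation-style setup; the weight 0.4 is a common default, but the exact value depends on the config:

```python
# Hedged sketch of how an auxiliary decode head contributes to the total
# training loss. The auxiliary head supervises intermediate features
# (i.e., a different scale/depth than the main decode head) and is
# discarded at inference time.

def total_loss(decode_loss, aux_losses, aux_weight=0.4):
    """Total = main decode-head loss + weighted sum of auxiliary losses."""
    return decode_loss + aux_weight * sum(aux_losses)

# Example with one main head and one auxiliary head (values illustrative).
print(total_loss(0.8, [0.5]))  # -> 1.0
```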

Mark1Dong commented 3 years ago

Thanks for your reply, and I fully agree with your view on the auxiliary decoder.