fartashf / vsepp

PyTorch Code for the paper "VSE++: Improving Visual-Semantic Embeddings with Hard Negatives"
Apache License 2.0

loss gap between train and test #45

Closed wwg8270 closed 2 years ago

wwg8270 commented 2 years ago

In the final epoch, the training loss is much lower than the test loss. Is this an overfitting problem? If so, overfitting starts as early as the second epoch. Details below:

2021-12-22 08:38:09,866 Epoch: [29][3223/3234] Eit 97010 lr 2e-05 Le 17.5955 (16.2388) Time 0.054 (0.000) Data 0.037 (0.000)
2021-12-22 08:38:10,415 Epoch: [29][3233/3234] Eit 97020 lr 2e-05 Le 9.2004 (16.2394) Time 0.054 (0.000) Data 0.038 (0.000)
2021-12-22 08:38:10,455 Test: [0/40] Le 65.0238 (65.0238) Time 0.040 (0.000)
2021-12-22 08:38:10,861 Test: [10/40] Le 64.3841 (64.6037) Time 0.040 (0.000)
2021-12-22 08:38:11,264 Test: [20/40] Le 64.7672 (64.5425) Time 0.041 (0.000)
2021-12-22 08:38:11,670 Test: [30/40] Le 64.4232 (64.6411) Time 0.041 (0.000)
2021-12-22 08:38:12,840 Image to text: 42.1, 75.0, 85.0, 2.0, 8.8
2021-12-22 08:38:13,407 Text to image: 33.6, 67.4, 80.3, 3.0, 18.0

fartashf commented 2 years ago

Overfitting typically refers to the case where the test loss increases while the training loss keeps decreasing. A gap between training and test loss is not by itself an indicator of overfitting. If you suspect overfitting, it makes sense to increase regularization to prevent it.
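
For reference, one common way to add regularization in PyTorch is weight decay on the optimizer. The snippet below is only a generic sketch with a placeholder model and an illustrative `weight_decay` value; it is not this repository's actual training configuration.

```python
# Minimal sketch (not from this repo): adding weight decay as a simple
# form of regularization when constructing the optimizer in PyTorch.
import torch

model = torch.nn.Linear(512, 1024)  # placeholder for the embedding model

# weight_decay applies an L2 penalty to the parameters at each update;
# 1e-4 is an illustrative value, not a tuned setting for this codebase.
optimizer = torch.optim.Adam(model.parameters(), lr=2e-4, weight_decay=1e-4)
```

Other standard options include dropout in the model or early stopping based on the validation recall, whichever fits your setup.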