Closed · swordfate closed this issue 2 years ago
You are correct about the training of CV and NLP models with multiple epochs. However, the training of DLRMs with multiple epochs is an open problem. The code supports it, but you will see that while the training accuracy continues to drop, the testing accuracy will increase in the subsequent epochs.
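For reference, multi-epoch training can be enabled from the command line. The invocation below is a sketch only: the flag names are assumed to match `dlrm_s_pytorch.py` from the facebookresearch/dlrm repository, and the data paths are placeholders, so verify against `python dlrm_s_pytorch.py --help` before running.

```shell
# Sketch: run the Kaggle/Criteo benchmark for several epochs instead of one.
# Flag names assume dlrm_s_pytorch.py (facebookresearch/dlrm); paths are placeholders.
python dlrm_s_pytorch.py \
    --data-generation=dataset \
    --data-set=kaggle \
    --raw-data-file=./input/train.txt \
    --nepochs=5 \
    --test-freq=1024 \
    --print-freq=1024
```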
Do you mean that the training loss continues to drop while the testing loss increases in subsequent epochs? If so, can I conclude that training a recommendation system for multiple epochs overfits the training dataset?
Yes, you can think of it in terms of the loss. It is possible that this is related to overfitting, but it is not clear exactly what happens. You can try it yourself: just train with multiple epochs and observe the effect.
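To make the "try it yourself" suggestion concrete, here is a minimal, self-contained sketch of the pattern being discussed. It is not DLRM-specific: it trains a plain logistic-regression model on purely random labels (so there is nothing generalizable to learn), and over many epochs the training loss keeps dropping while the test loss stays high, i.e. the model memorizes the training set.

```python
import math
import random

random.seed(0)

D, N_TRAIN, N_TEST, EPOCHS, LR = 50, 20, 20, 300, 0.3

def make_data(n):
    # Random features with purely random labels: nothing generalizable to learn.
    return [([random.gauss(0, 1) for _ in range(D)], random.randint(0, 1))
            for _ in range(n)]

train, test = make_data(N_TRAIN), make_data(N_TEST)
w = [0.0] * D

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def avg_log_loss(data):
    total = 0.0
    for x, y in data:
        p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
        p = min(max(p, 1e-12), 1 - 1e-12)  # clamp to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(data)

history = []
for epoch in range(EPOCHS):
    # Full-batch gradient descent on the logistic (cross-entropy) loss.
    grad = [0.0] * D
    for x, y in train:
        p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
        for j in range(D):
            grad[j] += (p - y) * x[j] / len(train)
    w = [wi - LR * g for wi, g in zip(w, grad)]
    history.append((avg_log_loss(train), avg_log_loss(test)))

print(f"epoch   1: train={history[0][0]:.3f}  test={history[0][1]:.3f}")
print(f"epoch {EPOCHS}: train={history[-1][0]:.3f}  test={history[-1][1]:.3f}")
```

Running it, you should see the train loss shrink toward zero across epochs while the test loss does not improve; whether DLRM on real click data behaves the same way after one epoch is exactly the open question above.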
Hi @mnaumovfb
You mentioned:
You are correct about the training of CV and NLP models with multiple epochs. However, the training of DLRMs with multiple epochs is an open problem. The code supports it, but you will see that while the training accuracy continues to drop, the testing accuracy will increase in the subsequent epochs.
It is strange that the training accuracy continues to drop while the testing accuracy increases in subsequent epochs. Did you mean the opposite (i.e. overfitting)?
The accuracy curves shown in README.md are for only one epoch on the Kaggle or Terabyte dataset. However, as we know, deep learning models for NLP are usually trained for multiple epochs, often with random data augmentation, to achieve better accuracy.
So, my question is: why is the DLRM model trained for only one epoch? Is it necessary to train for multiple epochs on the same data?
Looking forward to your reply, thank you very much :)