Hi there,
I tried training your model on the GAFA dataset from scratch, and it appears to be overfitting.
The direction_loss and kappa_loss continue to decrease on the training set, but val_loss and val_mae reach their minimum at the end of the first epoch and then keep increasing.
The learning rates for both direction_loss and kappa_loss start at 1e-4.
How did you avoid overfitting when you trained your model?
Thank you so much in advance.