Open zhougr18 opened 1 year ago
Hi, thanks for your interest.
We haven't tested the text modality before. But in my view, this phenomenon is perhaps because the training dynamics of the text modality, which is a highly abstract encoding of human knowledge, are quite different from those of rawer modalities (e.g., audio and vision). So modifying the gradient of a modality (as OGM does) may not be very useful for the text modality.
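For context, the per-modality gradient modulation discussed above can be sketched roughly as follows. This is a minimal illustration, not the repo's actual implementation: the function name, the `alpha` hyperparameter, and the exact form of the coefficient are assumptions here; the dominant modality's gradients are scaled down by a factor derived from the discrepancy between the two modalities' per-batch confidence scores.

```python
import math

def ogm_scale(score_strong, score_weak, alpha=0.5):
    """Illustrative gradient-scaling coefficient for the dominant modality.

    score_strong / score_weak: per-batch confidence scores of two modalities.
    alpha: modulation strength (hypothetical name, not the repo's API).
    """
    rho = score_strong / score_weak  # discrepancy ratio between modalities
    if rho > 1.0:
        # Dampen the dominant modality's gradient; the weaker one keeps scale 1.
        return 1.0 - math.tanh(alpha * (rho - 1.0))
    return 1.0

# The coefficient shrinks as one modality (e.g., text) dominates:
print(ogm_scale(3.0, 1.0))  # < 1: dominant modality's gradient is suppressed
print(ogm_scale(1.0, 1.0))  # == 1: balanced, no modulation applied
```

The intuition of the reply above is that for text, the "dominance" signal this coefficient reacts to may not reflect a fixable imbalance, so scaling the gradient does not rebalance training the way it does for audio and vision.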
Hello, I'm trying to apply the OGM-GE strategy to multimodal fusion networks with text, video, and audio modalities (e.g., MISA, MAG). However, when I use the SGD optimizer, the model trains with difficulty and finally achieves very low accuracy. When I switch to the Adam optimizer, the OGM-GE strategy doesn't seem to work, and training is still dominated by the text modality. Did these problems appear in your experiments? And how can I solve them? Looking forward to your reply.