GeWu-Lab / OGM-GE_CVPR2022

The repo for "Balanced Multimodal Learning via On-the-fly Gradient Modulation", CVPR 2022 (ORAL)

Model training problem with SGD and Adam optimizer #29

Open zhougr18 opened 1 year ago

zhougr18 commented 1 year ago

Hello, I'm trying to apply the OGM-GE strategy to multimodal fusion networks with text, video, and audio modalities (e.g., MISA, MAG). However, with the SGD optimizer, training proceeds with difficulty and finally reaches very low accuracy. When I switch to the Adam optimizer, the OGM-GE strategy seems to have no effect, and training is still dominated by the text modality. Did these problems appear in your experiments, and how can I solve them? Looking forward to your reply.

echo0409 commented 1 year ago

Hi, thanks for your interest.

We haven't tested the text modality before. But from my perspective, this phenomenon is perhaps because the training and learning of the text modality, which is a highly abstract encoding of human knowledge, are quite different from those of rawer modalities (e.g., audio and vision). So modifying the gradient of a modality (as OGM does) may not be very useful for the text modality.
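
For readers landing here, below is a minimal, self-contained sketch of the kind of OGM-GE-style gradient modulation this thread discusses, written against a toy two-modality (audio/visual) model. The module names (`audio_net`, `visual_net`), the toy encoders, and the hyperparameter value are illustrative assumptions, not the MISA/MAG code or the repo's exact implementation. As a hedged aside on the Adam question: Adam's per-parameter adaptive scaling partially re-normalizes gradient magnitudes, so directly scaling the raw gradient (as OGM does) may be partly undone by the optimizer, which could be one reason the effect is weaker under Adam than under SGD.

```python
# Minimal sketch of OGM-GE-style gradient modulation for two modalities.
# Toy model and names (audio_net / visual_net) are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyAVModel(nn.Module):
    def __init__(self, dim=16, n_classes=4):
        super().__init__()
        self.audio_net = nn.Linear(dim, n_classes)   # stand-in per-modality encoder + head
        self.visual_net = nn.Linear(dim, n_classes)

    def forward(self, a, v):
        out_a = self.audio_net(a)
        out_v = self.visual_net(v)
        return out_a, out_v, out_a + out_v           # late (sum) fusion

model = ToyAVModel()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)
alpha = 0.3  # modulation strength; a hyperparameter, value chosen arbitrarily here

a, v = torch.randn(8, 16), torch.randn(8, 16)
label = torch.randint(0, 4, (8,))

out_a, out_v, out = model(a, v)
loss = F.cross_entropy(out, label)

optimizer.zero_grad()
loss.backward()

with torch.no_grad():
    # Per-modality confidence: summed softmax probability of the correct class.
    idx = torch.arange(len(label))
    score_a = F.softmax(out_a, dim=1)[idx, label].sum()
    score_v = F.softmax(out_v, dim=1)[idx, label].sum()
    ratio = score_a / score_v  # > 1 means audio currently dominates

    # Slow down the dominant modality; leave the weaker one untouched (coeff = 1).
    if ratio > 1:
        coeff_a = 1 - torch.tanh(alpha * F.relu(ratio - 1))
        coeff_v = torch.tensor(1.0)
    else:
        coeff_a = torch.tensor(1.0)
        coeff_v = 1 - torch.tanh(alpha * F.relu(1 / ratio - 1))

    for name, p in model.named_parameters():
        if p.grad is None:
            continue
        coeff = coeff_a if 'audio_net' in name else coeff_v
        # OGM: scale the gradient; GE: add Gaussian noise matched to its std.
        noise = torch.zeros_like(p.grad).normal_(0, p.grad.std().item() + 1e-8)
        p.grad = p.grad * coeff + noise

optimizer.step()
```

Note that the paper formulates the discrepancy ratio for two modalities; extending this to the three-modality setting in the question (audio, video, text) would require deciding how to define per-modality coefficients, e.g., scoring each modality relative to the strongest one per step.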