GeWu-Lab / OGM-GE_CVPR2022

The repo for "Balanced Multimodal Learning via On-the-fly Gradient Modulation", CVPR 2022 (ORAL)
MIT License

Question on understanding #24

Open shenyujie1125 opened 1 year ago

shenyujie1125 commented 1 year ago

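The objective function being optimized is the joint result of the two modalities. Since OGM on its own improves classification accuracy, can this be understood as escaping a local optimum and finding the global optimum?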

shenyujie1125 commented 1 year ago

best_model_of_dataset_CREMAD_Normal_alpha_0.1_optimizer_sgd_modulate_starts_0_ends_50_epoch_23_acc_0.6155913978494624.pth
best_model_of_dataset_CREMAD_OGM_GE_alpha_0.1_optimizer_sgd_modulate_starts_0_ends_50_epoch_87_acc_0.6169354838709677

I only switched the modulation setting between OGM_GE and Normal and kept all other parameters unchanged, yet I got similar results. I don't quite understand this, and I cannot reproduce the numbers in Table 1.

shenyujie1125 commented 1 year ago

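I still can't understand how an accuracy of 0.6155 can be reached without using your modulation method, while Table 1 reports a value of 0.5-something. Your method changes the gradient descent speed of the different modalities so that the representation of the under-optimized modality is optimized further, but how does that produce a higher accuracy? To me, a higher accuracy means moving from a local optimum to the global optimum, and as far as I can see you did not add any extra parameters. I can't figure it out and look forward to your explanation. Thank you!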
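For reference, the idea described above (slowing down the gradient of the dominant modality so that the weaker one keeps being optimized) can be sketched roughly as follows. This is only a minimal sketch, assuming the coefficient takes the form 1 - tanh(alpha * ratio) for the modality whose score ratio exceeds 1, and 1 otherwise. The function names `modulation_coefficients` and `scale_gradients` and the per-modality "score" inputs are hypothetical and are not taken from the repository.

```python
import math


def modulation_coefficients(score_a, score_v, alpha=0.1):
    """Return (coeff_a, coeff_v) used to scale each modality's gradients.

    score_a / score_v are hypothetical positive per-batch confidence
    scores for the audio and visual branches (assumption: a larger score
    means that modality currently dominates the joint prediction).
    """
    ratio_a = score_a / score_v
    ratio_v = score_v / score_a
    # Only the dominant modality (ratio > 1) is slowed down;
    # the weaker modality keeps its full gradient (coefficient 1).
    coeff_a = 1.0 - math.tanh(alpha * ratio_a) if ratio_a > 1 else 1.0
    coeff_v = 1.0 - math.tanh(alpha * ratio_v) if ratio_v > 1 else 1.0
    return coeff_a, coeff_v


def scale_gradients(grads, coeff):
    """Scale a list of gradient values by the modulation coefficient."""
    return [g * coeff for g in grads]


# Example: audio dominates, so only the audio gradients are shrunk.
coeff_a, coeff_v = modulation_coefficients(score_a=2.0, score_v=1.0, alpha=0.1)
audio_grads = scale_gradients([0.5, -1.0], coeff_a)
visual_grads = scale_gradients([0.5, -1.0], coeff_v)
```

Note that nothing here adds parameters to the model: only the relative step sizes of the two encoders change during training, which is why the question is about the optimization path rather than model capacity.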

echo0409 commented 1 year ago

Hi,

Our method improves the optimization of multi-modal learning, bringing better generalization ability (indicated by the better performance in experiments). We have no guarantee that we will eventually reach a global minimum.

In addition, the difference in the concat baseline's performance on CREMAD may be caused by different experimental settings.

Thanks for your attention!