A question about understanding the method - Githubissues

GeWu-Lab / OGM-GE_CVPR2022

The repo for "Balanced Multimodal Learning via On-the-fly Gradient Modulation", CVPR 2022 (ORAL)

MIT License

221 stars, 18 forks

A question about understanding the method #24

Open shenyujie1125 opened 1 year ago

shenyujie1125 commented 1 year ago

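The optimization objective is the joint result of both modalities. Since OGM on its own improves classification accuracy, can this be understood as escaping a local optimum and finding the global optimum?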

shenyujie1125 commented 1 year ago

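best_model_of_dataset_CREMAD_Normal_alpha_0.1_optimizer_sgd_modulate_starts_0_ends_50_epoch_23_acc_0.6155913978494624.pth
best_model_of_dataset_CREMAD_OGM_GE_alpha_0.1_optimizer_sgd_modulate_starts_0_ends_50_epoch_87_acc_0.6169354838709677

I only switched the modulation setting between OGM_GE and Normal and kept every other parameter unchanged, yet I got nearly identical results (accuracy of about 0.6156 vs. 0.6169). I don't quite understand this, and I cannot reproduce the numbers in Table 1.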

shenyujie1125 commented 1 year ago

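I still can't understand how an accuracy of 0.6155 is reachable without your modulation method, when Table 1 in your paper reports a value in the 0.5x range for that setting. Your method changes how fast gradient descent proceeds for each modality, so that the under-optimized modality's representation gets optimized further. But how does that lead to higher accuracy? To me, higher accuracy would mean moving from a local optimum to the global optimum, and as far as I can see you have not added any extra parameters. I can't figure it out and look forward to your explanation. Thanks!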
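For readers of this thread, the kind of per-modality gradient scaling being asked about can be sketched roughly as below. This is a minimal illustration, not code taken from the repository: the function name `modulation_coefficients`, the `score_a`/`score_v` inputs (assumed to be positive per-batch confidence scores for the audio and visual branches), and the exact tanh-based form are assumptions. The default `alpha=0.1` only mirrors the `alpha_0.1` in the checkpoint names above.

```python
import math


def modulation_coefficients(score_a: float, score_v: float, alpha: float = 0.1):
    """Return (coeff_a, coeff_v), hypothetical gradient scaling factors.

    score_a and score_v are assumed to be positive confidence scores for the
    audio and visual branches on the current batch.
    """
    if score_a <= 0 or score_v <= 0:
        raise ValueError("scores must be positive")

    ratio_v = score_v / score_a  # > 1 means the visual branch dominates
    ratio_a = score_a / score_v  # > 1 means the audio branch dominates

    coeff_a, coeff_v = 1.0, 1.0
    if ratio_v > 1.0:
        # Slow down the dominant visual branch; leave audio untouched.
        coeff_v = 1.0 - math.tanh(alpha * ratio_v)
    elif ratio_a > 1.0:
        # Slow down the dominant audio branch; leave visual untouched.
        coeff_a = 1.0 - math.tanh(alpha * ratio_a)
    return coeff_a, coeff_v


# Example: audio is twice as confident as visual, so only audio is damped.
coeff_a, coeff_v = modulation_coefficients(score_a=2.0, score_v=1.0)
```

In a training loop, each coefficient would multiply the gradients of the corresponding encoder before the optimizer step. Under this sketch the modulation only rescales existing gradients, so it adds no parameters; any change in final accuracy would come from the optimization path, not from extra model capacity.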

echo0409 commented 1 year ago

Hi,

Our method improves the optimization of multi-modal learning, bringing better generalization ability (indicated by the better performance in experiments). We have no guarantee that we will eventually reach a global minimum.

In addition, the gap between your concat baseline result on CREMAD and ours may be caused by different experiment settings.

Thanks for your attention!