Why 'grad' should be filtered by its size?

GeWu-Lab / OGM-GE_CVPR2022

The repo for "Balanced Multimodal Learning via On-the-fly Gradient Modulation", CVPR 2022 (ORAL)

MIT License


Why 'grad' should be filtered by its size? #22

Open pmj110119 opened 2 years ago

pmj110119 commented 2 years ago

Thanks for your amazing work!

When the gradients are weighted, I noticed that each grad is filtered by the number of dimensions of its size:

if 'audio' in layer and len(parms.grad.size()) == 4:

What is the purpose of this check?

echo0409 commented 2 years ago

Hi,

Thanks for your attention. The purpose of this condition is to apply the modulation only to the parameters of the conv layers.
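For readers wondering why a check on `len(parms.grad.size())` singles out conv layers: in PyTorch, the weight of an `nn.Conv2d` is a 4-D tensor (out channels, in channels, kernel height, kernel width), while biases and batch-norm parameters are 1-D and `nn.Linear` weights are 2-D. The sketch below is only an illustration under assumptions: `ToyAudioNet` and the `audio_net` key are hypothetical names, not the repo's actual model. It lists which parameters pass the quoted condition.

```python
import torch
import torch.nn as nn

# Hypothetical toy encoder (not the repo's model): two conv layers,
# one batch norm, and an fc head.
class ToyAudioNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 4, kernel_size=3, padding=1)
        self.bn1 = nn.BatchNorm2d(4)
        self.conv2 = nn.Conv2d(4, 8, kernel_size=3, padding=1)
        self.fc = nn.Linear(8, 2)

    def forward(self, x):
        x = torch.relu(self.bn1(self.conv1(x)))
        x = torch.relu(self.conv2(x))
        x = x.mean(dim=(2, 3))  # global average pooling
        return self.fc(x)

# Register the encoder under a name containing 'audio' (an assumption
# made so that the quoted string check matches).
model = nn.ModuleDict({"audio_net": ToyAudioNet()})

# One backward pass so that every parameter has a .grad tensor.
out = model["audio_net"](torch.randn(2, 1, 8, 8))
out.sum().backward()

# Same condition as the line quoted in the question.
selected = [
    layer
    for layer, parms in model.named_parameters()
    if 'audio' in layer and len(parms.grad.size()) == 4
]

for layer, parms in model.named_parameters():
    print(layer, tuple(parms.grad.size()))
print("selected:", selected)
```

In this toy setup only the two conv weights pass the check; the conv biases, the batch-norm parameters, and the fc weight and bias are all skipped.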

auroraToT commented 1 year ago

Hello, I also have a question about this line of code.

Why modify only the parameters of the conv layers and not those of other layers?

If I use a new multimodal model that has other (non-conv) layers before the fusion layer, do I need to modify the parameters of those layers as well?

patrontheo commented 1 year ago

+1. What if my model does not have conv layers? Should I update all the parameters, or all except the pooling ones? Thank you! Theo

echo0409 commented 1 year ago

Hi, in our framework the backbone is composed of conv layers, and the fc layer is the classifier. In our case, we modulate the gradients of the backbone, i.e., the conv layers. If there are fc layers in your backbone, you need to apply the modulation to those fc layers as well.
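As a rough sketch of what that could look like for a backbone without conv layers, one option is to select backbone parameters by name instead of by gradient shape. Everything here is an assumption for illustration: `ToyModel`, the `audio_net` / `classifier` attribute names, and the fixed `coeff` value are hypothetical, and the noise term used by the GE part of the method is left out. The sketch also keeps only gradients with more than one dimension, mirroring how the original 4-D check skips biases; whether 1-D parameters should be modulated too is not settled in this thread.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical model: an fc-only backbone plus a separate fc classifier.
class ToyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.audio_net = nn.Sequential(
            nn.Linear(6, 5), nn.ReLU(), nn.Linear(5, 4)
        )
        self.classifier = nn.Linear(4, 3)

    def forward(self, x):
        return self.classifier(self.audio_net(x))

model = ToyModel()
model(torch.randn(2, 6)).sum().backward()

coeff = 0.5  # hypothetical modulation coefficient for the audio branch

# Keep a copy of the gradients to compare against after modulation.
before = {n: p.grad.clone() for n, p in model.named_parameters()}

for name, parms in model.named_parameters():
    if parms.grad is None:
        continue
    # Backbone parameters only; the classifier is left untouched.
    # dim() > 1 keeps weight matrices (fc or conv) and skips biases.
    if name.startswith('audio_net.') and parms.grad.dim() > 1:
        parms.grad.mul_(coeff)
```

After this loop, the gradients of `audio_net.0.weight` and `audio_net.2.weight` are scaled by `coeff`, while the backbone biases and all classifier gradients are unchanged. This step would sit between `loss.backward()` and `optimizer.step()` in a training loop.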