MengShen0709 / bmmal

[ACMMM 2023] BMMAL: Towards Balanced Active Learning for Multimodal Classification
https://arxiv.org/abs/2306.08306
Creative Commons Attribution 4.0 International

Question about unimodal gradient embedding #2

Open lemonsweetie opened 6 months ago

lemonsweetie commented 6 months ago

First of all, thank you for your excellent work! My question is: how does the scaling factor for the unimodal gradient embedding extend to three modalities? The paper only gives it for two.

MengShen0709 commented 6 months ago

It is a very good question!

Let's assume there are two data samples with three modalities:

$x_1$ has modality contributions $\Phi_{m_1}=0.5$, $\Phi_{m_2}=0.3$, and $\Phi_{m_3}=0.2$. $x_2$ has modality contributions $\Phi_{m_1}=0.6$, $\Phi_{m_2}=0.3$, and $\Phi_{m_3}=0.1$. As you can see, $x_2$ here is more unbalanced than $x_1$. You can calculate the dominance degree $\rho(x)$ according to Eq. 5. Here, $\rho(x_1)=0.5$ and $\rho(x_2)=0.8$.

Then, to scale their gradient embeddings in the three-modality case with Eq. 13, we first identify the most dominant modality for each data sample, then scale each of the remaining modalities against it, pair by pair. Take $x_1$ as an example: we scale $m_1$ and $m_2$ with weights of 1.0 and $1-(\Phi_{m_1} - \Phi_{m_2}) = 0.8$, and scale $m_1$ and $m_3$ with weights of 1.0 and $1-(\Phi_{m_1} - \Phi_{m_3}) = 0.7$.
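A minimal Python sketch of this pairwise rule, as I read it from the description above, may make it concrete. It is not taken from the repository: the function name `pairwise_scaling_weights` and the dict-based input are hypothetical, and it assumes the modality contributions are already computed and that ties for the dominant modality are broken by taking the first one.

```python
def pairwise_scaling_weights(contributions):
    """Return one scaling weight per modality for a single sample.

    `contributions` maps a modality name to its contribution Phi.
    Hypothetical helper: the dominant modality keeps weight 1.0 and every
    other modality m gets 1 - (Phi_dominant - Phi_m).
    """
    # Modality with the largest contribution (first one wins on ties).
    dominant = max(contributions, key=contributions.get)
    phi_dominant = contributions[dominant]

    weights = {}
    for modality, phi in contributions.items():
        if modality == dominant:
            weights[modality] = 1.0
        else:
            # Pairwise comparison against the dominant modality only.
            weights[modality] = 1.0 - (phi_dominant - phi)
    return weights


# The two samples from the example above.
weights_x1 = pairwise_scaling_weights({"m1": 0.5, "m2": 0.3, "m3": 0.2})
weights_x2 = pairwise_scaling_weights({"m1": 0.6, "m2": 0.3, "m3": 0.1})

for name, weights in (("x1", weights_x1), ("x2", weights_x2)):
    print(name, {m: round(w, 3) for m, w in weights.items()})
```

For $x_1$ this gives the weights 1.0, 0.8, and 0.7 quoted above. Applying the same rule to $x_2$ gives 1.0, 0.7, and 0.5, so under this reading the more unbalanced sample has its weaker modalities scaled down more.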

So, as you can see, by doing so we will guarantee:

I hope my explanation will solve your problem.

Cheers!

lemonsweetie commented 6 months ago

Thanks for your reply! Your answer solves my problem, but I wanted to ask if there was a slight error. For $x_1$, the weights of $m_2$ and $m_3$ are $1-(\Phi_{m_1}-\Phi_{m_2})=0.8$ and $1-(\Phi_{m_1}-\Phi_{m_3})=0.7$, respectively.

MengShen0709 commented 6 months ago

Yes, you are right. Thank you for pointing out the error in my reply.