-
When I fuse RGB and audio, the AP is 78.64%, as reported in your paper. But if I fuse all three modalities, the AP is worse than in your paper. In principle, fusing more modalities should work better, but in practice it does not. I am …
-
Hi @bhaba-ranjan,
Thanks for sharing the repository; I'm currently looking to reproduce this work. Are the default hyperparameters set in the multimodal.py file enough to reproduce your mo…
-
Hi,
In the paper, the formula you use for Gated Multimodal Fusion is a bit different from the one in the code.
For example, the concatenation of img_new_resize and tweet_new_resize became a sum in t…
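For context, a minimal sketch of the two gate variants in question. The names img_new_resize and tweet_new_resize are taken from the issue; the class, layer sizes, and everything else are hypothetical and follow the standard Gated Multimodal Unit formulation, not necessarily this repository's code:

import torch
import torch.nn as nn

class GatedFusion(nn.Module):
    """Gated multimodal fusion; gate_mode switches between a
    concatenation-based gate and a summation-based gate."""
    def __init__(self, dim, gate_mode="concat"):
        super().__init__()
        self.gate_mode = gate_mode
        gate_in = 2 * dim if gate_mode == "concat" else dim
        self.w_z = nn.Linear(gate_in, dim)

    def forward(self, img_new_resize, tweet_new_resize):
        if self.gate_mode == "concat":
            # Paper-style gate: z = sigmoid(W_z [h_img ; h_tweet])
            gate_in = torch.cat([img_new_resize, tweet_new_resize], dim=-1)
        else:
            # Code-style gate: z = sigmoid(W_z (h_img + h_tweet))
            gate_in = img_new_resize + tweet_new_resize
        z = torch.sigmoid(self.w_z(gate_in))
        # Convex combination of the two modality representations.
        return z * img_new_resize + (1 - z) * tweet_new_resize

fusion = GatedFusion(dim=64, gate_mode="concat")
img, tweet = torch.randn(8, 64), torch.randn(8, 64)
fused = fusion(img, tweet)  # shape (8, 64)

The two gates have different parameter counts (2*dim vs dim inputs to W_z), so they are not equivalent in general, which is presumably why the discrepancy matters.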
-
Hi, I am going to submit my paper about semantic segmentation, and I am wondering which subject I should choose. Could you please share your choice of SUBJECT AREAS with me?
Subject Areas:
Deep …
-
Hello!
I was able to run the face recognition part successfully.
However, running your code for the voice recognition part raises problems. Could you please share your dataset and triplet_loss_trai…
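For anyone blocked on this in the meantime, here is a minimal, self-contained sketch of one triplet-loss training step on voice embeddings. The embedding network, feature dimension, and batch size are all hypothetical, not the repository's:

import torch
import torch.nn as nn

# Hypothetical embedding network for 40-dim voice features.
embed = nn.Sequential(nn.Linear(40, 128), nn.ReLU(), nn.Linear(128, 64))
criterion = nn.TripletMarginLoss(margin=1.0)
opt = torch.optim.Adam(embed.parameters(), lr=1e-3)

# Random stand-ins for a batch of anchor/positive/negative features.
anchor, positive, negative = (torch.randn(32, 40) for _ in range(3))
loss = criterion(embed(anchor), embed(positive), embed(negative))
opt.zero_grad()
loss.backward()
opt.step()

The loss pulls anchor and positive embeddings together while pushing the negative at least `margin` further away.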
-
positive = -(mu - y)**2/2./torch.exp(logvar)
Is "positive" vector (above in line 152) for the p(y|x) ~ N(y|µθ1(x), σ2 (x) I)? where is the -(lnσ + C) items in the probability density function for …
-
Are there any ways to bypass the data-preprocessing step for MBT ("Attention Bottlenecks for Multimodal Fusion") if I only want to run inference without passing in the actual data from AS? I notice the m…
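Not the authors' answer, but the usual trick for exercising inference without the real AudioSet pipeline is to feed randomly generated tensors of the expected shapes. A hypothetical sketch: the stub module, the input shapes, and the forward signature all stand in for the real MBT model and would need to match the actual checkpoint (AudioSet does have 527 classes):

import torch
import torch.nn as nn

class StubMBT(nn.Module):
    # Placeholder for a loaded MBT checkpoint; only the interface matters here.
    def forward(self, rgb, spectrogram):
        return torch.zeros(rgb.shape[0], 527)  # 527 AudioSet classes

model = StubMBT().eval()
# Random stand-ins for preprocessed AudioSet data: 8 RGB frames at
# 224x224 and a 128-bin log-mel spectrogram (shapes are assumptions).
rgb = torch.randn(1, 8, 3, 224, 224)
spec = torch.randn(1, 128, 100)
with torch.no_grad():
    logits = model(rgb, spec)
print(logits.shape)  # torch.Size([1, 527])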
-
Hi~
Can you provide the pretrained checkpoint?
It would be great if you could provide one.
-
Your paper only shows the results for each modality. What about the fusion result of RGB, depth, and flow?