-
Hi,
We used this config to train the AVE task on a 3090 with the processed data you provided, but the accuracy we got is 73.31.
python3 /code/AVE/main_trans.py --Adapter_downsample=8 --batch_siz…
-
I want to run the demo, but cannot find pie_file/sitw_file/pubfig_file/Samples_Paired_pie_full.csv/pubfig_epa.csv listed in config.ini. How can I generate those files from the original dataset? And where ca…
-
Hey - I am unable to reproduce the reported zero-shot results. So far I have tried MSRVTT and MSVD; I would appreciate it if you could kindly have a look.
Here is what I got after running these 2 script…
-
Hello,
I'm working on reproducing the results in your paper "Attention Bottlenecks for Multimodal Fusion" and trying to implement MBT for other audiovisual video classification tasks.
However, the pr…
-
May I ask whether there is a reference paper for this code?
-
Thank you very much for your colourful work. May I ask if you can share the datasets used in your paper? I look forward to your reply.
-
Hello, can you tell me the relevant reference papers?
-
Thanks for sharing the code. Llama_adapter_v2_multimodal seems to be the implementation of the LLaMA-Adapter v1 paper. How, then, can the results in the v2 paper be reproduced?
-
Hello, I'm trying to apply the OGM-GE strategy to a multimodal fusion network with text, video and audio modalities (e.g. MISA, MAG). However, when I use the SGD optimizer, the model training process moves on wi…
-
[UAMD-Net: A Unified Adaptive Multimodal Neural Network for Dense Depth Completion](https://arxiv.org/pdf/2204.07791.pdf)