Modal dimensions (audio, video) for the MOSI are different from values reported in paper?

souravBhat commented 3 years ago

Hi,

I am noticing a discrepancy in model performance between runs on the latest version of MOSI from the CMU Multimodal SDK and the numbers reported in the paper. Upon digging further, it turns out that the audio and video dimensions in the data provided with this repo (5 and 20 respectively) differ from the values reported in Appendix D of the paper (35 and 74 respectively). Could you please comment on the difference? Am I missing something?
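For anyone who wants to repeat the check on their own copy of the data, here is a minimal sketch of the comparison. It assumes each modality is stored as an array whose last axis is the feature dimension, and that you pass in the array shapes yourself (e.g. `array.shape`). The function name, the modality keys, and the example shapes are hypothetical and not taken from the repo; only the four dimension values come from this issue.

```python
from typing import Dict, Tuple

# Feature dimensions quoted in this issue (audio, video).
REPO_DIMS: Dict[str, int] = {"audio": 5, "video": 20}
PAPER_DIMS: Dict[str, int] = {"audio": 35, "video": 74}


def feature_dim_mismatches(
    shapes: Dict[str, Tuple[int, ...]],
    expected: Dict[str, int],
) -> Dict[str, Tuple[int, int]]:
    """Return {modality: (observed_dim, expected_dim)} for every mismatch.

    Assumption: the last axis of each shape is the feature dimension.
    """
    mismatches = {}
    for modality, expected_dim in expected.items():
        observed_dim = shapes[modality][-1]
        if observed_dim != expected_dim:
            mismatches[modality] = (observed_dim, expected_dim)
    return mismatches


# Placeholder shapes (batch, sequence length, features); only the last
# entry of each tuple reflects a value reported in this issue.
example_shapes = {"audio": (8, 50, 5), "video": (8, 50, 20)}

for name, (seen, wanted) in feature_dim_mismatches(example_shapes, PAPER_DIMS).items():
    print(f"{name}: data has {seen} features, paper reports {wanted}")
```

With real data, the `example_shapes` dictionary would be replaced by the shapes of whatever arrays your loader returns.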

yehaizhi commented 3 years ago

Could you share the dataset you downloaded with me? I can't download the dataset shared by the author.

h-0-0 commented 3 weeks ago

I'm also looking for clarification on the preprocessing for MOSI. In #4, one of the authors says that they select a subset of the features and to email them for details. I think the details should really be public. @yaohungt @jerrybai1995 @bryant1410

Also, the email address given doesn't work for me; I think it's outdated.

yaohungt / Multimodal-Transformer, issue #40