yaohungt / Multimodal-Transformer

[ACL'19] [PyTorch] Multimodal Transformer

Hyperparameter Settings #5

Closed · BinWang28 closed this 5 years ago

BinWang28 commented 5 years ago

Hi author,

Many thanks for sharing the code. I am trying to reproduce the paper's results on the CMU-MOSEI dataset. I am using the currently released version and changed the hyperparameter settings for CMU-MOSEI according to the appendix (Table 5 in your paper).

However, I have not obtained similar results so far (accuracy is about 1-2% lower). Is there anything else I need to modify to reproduce the reported numbers? I would appreciate any help.

I have tried both the aligned and unaligned data. I haven't tried the other two datasets yet.

yaohungt commented 5 years ago

CMU-MOSEI is a very small dataset that is easy to overfit. You may want to try more random seeds, or add some regularization or early stopping to prevent overfitting.
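For anyone unsure how to combine the two suggestions, here is a minimal sketch of seed averaging plus early stopping on validation loss. `build_model`, `train_fn`, and `eval_fn` are hypothetical stand-ins for this repo's model constructor and train/evaluate loops, not actual functions from the codebase:

```python
import torch

def run_one_seed(seed, build_model, train_fn, eval_fn,
                 max_epochs=40, patience=8):
    """Train with a fixed seed and stop once validation loss stalls.

    build_model/train_fn/eval_fn are hypothetical stand-ins for the
    repo's model constructor and train/evaluate loops.
    """
    torch.manual_seed(seed)
    model = build_model()
    best_loss, best_state, stale = float("inf"), None, 0
    for _ in range(max_epochs):
        train_fn(model)               # one training epoch
        val_loss = eval_fn(model)     # loss on the validation split
        if val_loss < best_loss:
            best_loss, stale = val_loss, 0
            best_state = {k: v.clone() for k, v in model.state_dict().items()}
        else:
            stale += 1
            if stale >= patience:     # no improvement for `patience` epochs
                break
    model.load_state_dict(best_state)  # restore the best checkpoint
    return model, best_loss

# Small datasets are noisy: try several seeds and keep the best run, e.g.
# results = [run_one_seed(s, build_model, train_fn, eval_fn) for s in (1, 2, 3)]
```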

jerrybai1995 commented 5 years ago

Just to add to @yaohungt's comment above: I just tried again and was able to reproduce most of the CMU-MOSEI numbers in one run. I would suggest slightly tuning the dropout rates (especially the embedding dropout and residual dropout) and the number of transformer levels, while re-running a few times with different seeds (you may sometimes get results better than those in the table).
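A small grid sweep over those knobs can be scripted from the shell side. The sketch below assumes the repo's `main.py` exposes argparse flags named `--embed_dropout`, `--res_dropout`, `--nlevels`, `--seed`, and `--dataset`; please verify the exact names and the dataset identifier against your checkout, as they are assumptions here:

```python
import itertools
import subprocess

# Sweep values chosen as illustration; center them on the Table 5 settings.
embed_dropouts = (0.2, 0.25, 0.3)
res_dropouts = (0.1, 0.15)
nlevels = (4, 6)
seeds = (1111, 1112, 1113)

for ed, rd, nl, seed in itertools.product(embed_dropouts, res_dropouts,
                                          nlevels, seeds):
    # Flag names assume the repo's main.py options; check argparse in main.py.
    subprocess.run(
        ["python", "main.py",
         "--dataset", "mosei_senti",
         "--embed_dropout", str(ed),
         "--res_dropout", str(rd),
         "--nlevels", str(nl),
         "--seed", str(seed)],
        check=True,
    )
```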

BinWang28 commented 5 years ago

Many thanks for the reply. I will try a few more times and see how it works.

sylvia5monthes commented 4 years ago

@BinWang28 were you able to get the results? I'm having a similar issue.

JKBox commented 4 years ago

@jerrybai1995 could you kindly share the hyperparameters for the aligned MOSEI dataset?