pliang279 / MFN

[AAAI 2018] Memory Fusion Network for Multi-view Sequential Learning
MIT License

Memory-Fusion-Network

Code for Memory Fusion Network (MFN), AAAI 2018, https://arxiv.org/abs/1802.00927

This repository includes data, code, and pretrained models for the AAAI 2018 paper "Memory Fusion Network for Multi-view Sequential Learning".

Data: we have included preprocessed data from the CMU-MOSI dataset for multimodal sentiment analysis, found in data/X_train.h5, data/y_train.h5, etc. To stay consistent with previously reported results on CMU-MOSI, we used exactly the same data as the baselines.

Code: training code for both MFN and EF-LSTM (early fusion LSTM) is included in test_mosi.py
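
As a rough illustration of what "early fusion" means for EF-LSTM: the per-timestep feature vectors of each view are concatenated into one vector before a single LSTM sees them. The sketch below shows only that concatenation step in plain Python. The function name `early_fuse` and the toy feature sizes are assumptions made for illustration and are not taken from test_mosi.py.

```python
def early_fuse(views):
    """Concatenate per-timestep features from several views.

    `views` is a list of sequences, one per modality; each sequence is a
    list of T feature vectors (lists of floats). Returns a single sequence
    of T vectors whose length is the sum of the per-view feature sizes.
    """
    lengths = {len(seq) for seq in views}
    if len(lengths) != 1:
        raise ValueError("all views must have the same number of timesteps")
    fused = []
    for step_features in zip(*views):
        row = []
        for features in step_features:
            row.extend(features)
        fused.append(row)
    return fused


# Toy example: 2 timesteps, three views with hypothetical sizes 3, 2 and 1.
language = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]
visual = [[1.0, 1.1], [1.2, 1.3]]
acoustic = [[2.0], [2.1]]

fused = early_fuse([language, visual, acoustic])
print(len(fused), len(fused[0]))
```

The fused sequence would then be the input to one recurrent model, in contrast to MFN, which keeps a separate LSTM per view.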

Pretrained models: pretrained MFN models optimized for MAE (Mean Absolute Error) and for binary classification accuracy can be found in best/mfn_mae.pt and best/mfn_acc.pt, respectively.

Installation

First check that the requirements are satisfied:
Python 2.7
PyTorch 0.4.0
numpy 1.13.3
scikit-learn 0.20.0 (imported as sklearn)

If not, these packages can be installed using pip.
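
One way to check the requirements above before installing anything is a short script that compares the installed versions against the ones listed here. This is only a sketch: the helper names `find_mismatches` and `installed_versions` are made up for this example, and it flags any version that differs from the listed one, which does not necessarily mean that version will fail.

```python
import sys

# Versions listed in this README; keys are importable module names.
REQUIRED = {"torch": "0.4.0", "numpy": "1.13.3", "sklearn": "0.20.0"}


def find_mismatches(required, installed):
    """Return {name: (wanted, found)} for every package that differs.

    `installed` maps module name to a version string, or None if missing.
    """
    mismatches = {}
    for name, wanted in required.items():
        found = installed.get(name)
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


def installed_versions(names):
    """Import each module and read its __version__, or None on failure."""
    versions = {}
    for name in names:
        try:
            module = __import__(name)
            versions[name] = getattr(module, "__version__", None)
        except ImportError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    print("Python %d.%d" % sys.version_info[:2])
    report = find_mismatches(REQUIRED, installed_versions(REQUIRED))
    for name, (wanted, found) in sorted(report.items()):
        print("%s: README lists %s, found %s" % (name, wanted, found))
```

A package reported with `found None` is either not installed or does not expose `__version__`.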

The next step is to clone the repository:

git clone https://github.com/pliang279/Memory-Fusion-Network.git

You can run the code with

python test_mosi.py

from the command line. This loads the pretrained model best/mfn_mae.pt, which gives a CMU-MOSI test set MAE of 0.954, and the pretrained model best/mfn_acc.pt, which gives a CMU-MOSI test set binary classification accuracy of 77.4%.
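
For readers unfamiliar with the two metrics, here is a minimal sketch of how MAE and binary accuracy can be computed from real-valued sentiment predictions. It assumes that binary accuracy is obtained by thresholding both predictions and labels at zero (scores >= 0 counted as positive); that convention, the function names, and the toy numbers are assumptions for illustration, not the exact evaluation code in test_mosi.py.

```python
def mean_absolute_error(predictions, labels):
    """Average of |prediction - label| over all examples."""
    total = sum(abs(p - y) for p, y in zip(predictions, labels))
    return total / float(len(labels))


def binary_accuracy(predictions, labels, threshold=0.0):
    """Fraction of examples where prediction and label fall on the same
    side of `threshold` (assumed convention: score >= 0 is positive)."""
    hits = sum((p >= threshold) == (y >= threshold)
               for p, y in zip(predictions, labels))
    return hits / float(len(labels))


# Toy sentiment scores, not real model outputs.
preds = [1.2, -0.5, 0.3, -2.0]
labels = [1.0, 0.4, 0.6, -1.5]

print(mean_absolute_error(preds, labels))
print(binary_accuracy(preds, labels))  # 0.75
```

In the toy data the second prediction has the wrong sign, so three of four examples count as correct.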

Next steps: we are in the process of integrating the model with the latest version of the CMU-MOSI and CMU-MOSEI datasets, which can be found at https://github.com/A2Zadeh/CMU-MultimodalSDK/

If you use this code, please cite our paper:

@article{zadeh2018memory,
  title={Memory Fusion Network for Multi-view Sequential Learning},
  author={Zadeh, Amir and Liang, Paul Pu and Mazumder, Navonil and Poria, Soujanya and Cambria, Erik and Morency, Louis-Philippe},
  journal={Proceedings of the Thirty-Second {AAAI} Conference on Artificial Intelligence},
  year={2018}
}

Related papers and repositories building upon these datasets:
CMU-MOSEI dataset: paper, code
Multi-Attention Recurrent Network: paper, code
Graph-MFN: paper, code
Multimodal Transformer: paper, code
Multimodal Cyclic Translations: paper, code