pliang279 / MFN

[AAAI 2018] Memory Fusion Network for Multi-view Sequential Learning
MIT License

my own data #6

Open justlovebarbecue opened 5 years ago

justlovebarbecue commented 5 years ago

Hi, thanks for sharing the code! Could you please provide more details about how to use our own data with this model? Thanks!

justlovebarbecue commented 5 years ago

Hi, I have revised your code and fed my own data into the model; however, I obtained a low accuracy of around 20%. My dataset is a time series classification task with 20 classes. I checked the dataset you provided, and the labels are not integers but rather values with decimal points, like 2.5, 4.5, etc. Could you provide any insight about this? Is it caused by the label values? THX~

ghost commented 5 years ago

@justlovebarbecue, happy to help. The reason the labels are floats as opposed to ints is that we have three sentiment scores from 3 master-level crowd workers. These scores are averaged, which naturally leads to float values (you can refer to the original CMU-MOSI paper for further details). Can you give us a description of your task/data? How many modalities do you have? What is your parameter space?
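
For illustration, a tiny sketch of where the floats come from (the scores below are made up):

```python
# Hypothetical illustration: three integer sentiment scores for one clip,
# each on the CMU-MOSI scale from -3 (very negative) to +3 (very positive).
annotator_scores = [2, 3, 2]

# The released label is the average of the individual scores, which
# naturally produces non-integer values such as 2.33 or 2.5.
label = sum(annotator_scores) / len(annotator_scores)
print(label)  # 2.333...
```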

justlovebarbecue commented 5 years ago

Hi, thanks for the reply. I have time series data with 3 views. I directly revised your "test_mosi.py": I commented out your "load_saved_data" function and loaded my own data with shape (num_of_sample, time_step, feature_dim). The parameter space is exactly the same as provided in the code, except for "input_dims", which I revised to match my own view split.
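
Roughly, the change looks like this (a minimal sketch only; the file names, split ratios, and return structure are placeholders, adapted to match whatever the rest of "test_mosi.py" expects):

```python
import numpy as np

# Hypothetical replacement for load_saved_data() in test_mosi.py; file names,
# split ratios, and the returned structure are placeholders.
def load_my_data():
    view1 = np.load('view1.npy')    # (num_of_sample, time_step, d1)
    view2 = np.load('view2.npy')    # (num_of_sample, time_step, d2)
    view3 = np.load('view3.npy')    # (num_of_sample, time_step, d3)
    labels = np.load('labels.npy')  # (num_of_sample,)

    # Simple random train/valid/test split.
    n = len(labels)
    idx = np.random.permutation(n)
    tr, va = int(0.7 * n), int(0.85 * n)
    splits = {}
    for name, sel in [('train', idx[:tr]), ('valid', idx[tr:va]), ('test', idx[va:])]:
        splits[name] = ([view1[sel], view2[sel], view3[sel]], labels[sel])
    return splits

# input_dims is then set to the per-view feature dimensions, e.g.
# input_dims = [d1, d2, d3]
```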

ghost commented 5 years ago

It is not unexpected to get suboptimal results on a new dataset with parameters we tuned on CMU-MOSI. My suggestion is to "debug" your modalities by checking whether each of them gets reasonable performance on its own. For example, just run an LSTM on a single modality and see how it performs. From fine-tuning each modality, you may learn, for example, that the data in each modality is not on the same order of magnitude, which causes the optimizer to fail when all three modalities are trained together. Once you have debugged your modalities, you can run MFN again and hopefully see a performance improvement.
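
Something along these lines might help (a rough sketch, not code from this repo; the hidden size and the 20-class output are placeholders for your task):

```python
import torch
import torch.nn as nn

def check_scale(view, name):
    # Print basic statistics to see whether the views are on comparable scales;
    # if not, standardizing each view before training usually helps.
    print(f"{name}: mean={view.mean():.4f} std={view.std():.4f} "
          f"min={view.min():.4f} max={view.max():.4f}")

class SingleViewLSTM(nn.Module):
    # A plain LSTM baseline for one modality at a time.
    def __init__(self, feature_dim, hidden_dim=64, num_classes=20):
        super().__init__()
        self.lstm = nn.LSTM(feature_dim, hidden_dim, batch_first=True)
        self.fc = nn.Linear(hidden_dim, num_classes)

    def forward(self, x):           # x: (batch, time_step, feature_dim)
        _, (h, _) = self.lstm(x)    # h: (num_layers, batch, hidden_dim)
        return self.fc(h[-1])       # logits: (batch, num_classes)
```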

justlovebarbecue commented 5 years ago

Hi, thank you so much for your kind reply. I will have a try following your hints. THX!

ghost commented 5 years ago

@justlovebarbecue Of course. One thing about multimodal models is that they are hard to optimize. Following a grid search, you may find regions within the hyperparameter space that work better on the dev set, and you can then focus on those. Keep us in the loop!
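
For example, a simple loop like this (a sketch only; the ranges and the `train_and_eval` helper are placeholders you would replace with your own training run):

```python
import itertools
import random

def train_and_eval(config):
    # Placeholder: replace with a real MFN training run that returns
    # the accuracy on the dev set for this hyperparameter configuration.
    return random.random()

# A small illustrative grid; the ranges are placeholders.
grid = {
    'hidden_dim':    [32, 64, 128],
    'learning_rate': [1e-2, 1e-3, 1e-4],
    'dropout':       [0.0, 0.2, 0.5],
}

best_acc, best_config = -1.0, None
for values in itertools.product(*grid.values()):
    config = dict(zip(grid.keys(), values))
    acc = train_and_eval(config)
    if acc > best_acc:
        best_acc, best_config = acc, config

print('best config on dev set:', best_config, best_acc)
```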

justlovebarbecue commented 5 years ago

OK! Thanks for your advice! Will try it! :)

justlovebarbecue commented 5 years ago

Hi, I noticed that there is another method called "EFLSTM". Is that just a basic LSTM model run on the data from the different views concatenated along the feature dimension? I have read the code and want to confirm it here. THX

ghost commented 5 years ago

@justlovebarbecue Yup, that's it. It's just word-level feature concatenation that goes into an LSTM.
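
Conceptually, something like this (a minimal sketch of the early-fusion idea, not the exact EFLSTM implementation in this repo; the hidden size and class count are placeholders):

```python
import torch
import torch.nn as nn

# Early fusion: concatenate all views along the feature dimension at every
# time step and feed the fused sequence to a single LSTM.
class EarlyFusionLSTM(nn.Module):
    def __init__(self, input_dims, hidden_dim=64, num_classes=20):
        super().__init__()
        self.lstm = nn.LSTM(sum(input_dims), hidden_dim, batch_first=True)
        self.fc = nn.Linear(hidden_dim, num_classes)

    def forward(self, views):
        # views: list of tensors, each of shape (batch, time_step, feature_dim_i)
        fused = torch.cat(views, dim=-1)   # (batch, time_step, sum(input_dims))
        _, (h, _) = self.lstm(fused)
        return self.fc(h[-1])
```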

justlovebarbecue commented 5 years ago

@A2Zadeh OK. Got it. Thank you so much!

ghost commented 5 years ago

@justlovebarbecue You are welcome! Good luck.

justlovebarbecue commented 5 years ago

Hi @A2Zadeh, I am trying to replicate your results on the YouTube dataset. I have some questions:

  1. Can I use the same "test_mosi.py" for this?
  2. Could you please share the hyperparameter settings? I plan to tune them starting from a single view.
  3. Is the YouTube dataset contained in the "new_data" folder the right version to use for reproducing the results in the paper? Thanks a lot!