justlovebarbecue opened 5 years ago
Hi, I have revised your code and fed my own data into the model, but I obtained a low accuracy of around 20%. My dataset is a time-series classification task with 20 classes. I checked the dataset you provided: the labels are not integers but floats with decimal points, like 2.5, 4.5, etc. Could you provide any insight? Is this caused by the label values? THX~
@justlovebarbecue, happy to help. The reason our data is in floats as opposed to ints is that we have sentiment scores from 3 master-level crowd workers. These scores are averaged, which naturally leads to float values (you can refer to the original CMU-MOSI paper for further details). Can you give us a description of your task/data? How many modalities do you have? What is your parameter space?
Hi, thanks for the reply. I have time-series data with 3 views. I directly revised your code "test_mosi.py": I commented out your "load_saved_data" function and loaded my own data with shape (num_of_sample, time_step, feature_dim). The parameter space is exactly the same as provided in the code, except for "input_dims", which I revised to match my own view split.
It is not unexpected to get suboptimal results on a new dataset with parameters we tuned on CMU-MOSI. My suggestion is to "debug" your modalities by seeing whether any of them get good performance on their own. For example, just run an LSTM on a single modality and see how it performs. From fine-tuning each modality, you may learn, for example, that your data across modalities is not on the same order of magnitude, which causes optimizers to fail when all three modalities are trained together. Once your modalities are debugged, you can run MFN again and hopefully see a performance improvement.
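For the "same order of magnitude" check, a minimal sketch of what that debugging step could look like (the view names, shapes, and scales below are all hypothetical stand-ins for your own data, and z-scoring is just one common normalization choice):

```python
import numpy as np

# Hypothetical stand-ins for three modality tensors of shape
# (num_of_sample, time_step, feature_dim); replace with your own views.
rng = np.random.default_rng(0)
view_a = rng.normal(0.0, 1.0, size=(32, 20, 50))    # roughly unit scale
view_b = rng.normal(0.0, 100.0, size=(32, 20, 10))  # two orders larger
view_c = rng.normal(0.0, 0.01, size=(32, 20, 5))    # two orders smaller

# Print per-view scale statistics to spot mismatched magnitudes.
for name, view in [("view_a", view_a), ("view_b", view_b), ("view_c", view_c)]:
    print(f"{name}: mean_abs={np.abs(view).mean():.4f}, std={view.std():.4f}")

def zscore(x):
    """Standardize each feature dimension over samples and time steps."""
    mu = x.mean(axis=(0, 1), keepdims=True)
    sigma = x.std(axis=(0, 1), keepdims=True) + 1e-8
    return (x - mu) / sigma

# After standardization, all views are on a comparable scale.
views = [zscore(v) for v in (view_a, view_b, view_c)]
```

If the printed stats differ by orders of magnitude across views, standardizing each one before training is usually the first thing to try.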
Hi, thank you so much for your kind reply. I will give it a try following your hints. THX!
@justlovebarbecue Of course. One thing about multimodal models is that they are hard to optimize. Following a grid search, you may find regions of the hyperparameter space that work better on the dev set, and you can then focus on those. Keep us in the loop!
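A grid search like the one suggested above can be sketched in a few lines (the grid values here are hypothetical, and `dev_score` is a placeholder for whatever training-plus-dev-evaluation routine you use):

```python
from itertools import product

# Hypothetical search grid; the real MFN search space is larger.
grid = {
    "lr": [1e-3, 1e-4],
    "hidden_dim": [32, 64, 128],
    "dropout": [0.0, 0.2, 0.5],
}

def dev_score(config):
    """Placeholder: train the model with `config`, return dev accuracy."""
    raise NotImplementedError

# Enumerate every combination in the grid as a config dict.
configs = [dict(zip(grid, values)) for values in product(*grid.values())]

# best = max(configs, key=dev_score)  # then narrow the grid around `best`
```

After a first pass, you would shrink the ranges around whichever region scores best on the dev set and search again.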
OK! Thanks for your advice! Will try it! :)
Hi, I noticed that there is another method called "EFLSTM". Is that just a basic LSTM model that uses data from the different views concatenated along the feature dimension? I have read the code and want to confirm here. THX
@justlovebarbecue Yup, that's it. It's just word-level feature concatenation fed into an LSTM.
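In other words, the early-fusion step amounts to a per-time-step concatenation before a single LSTM sees the sequence. A minimal sketch of just the fusion (the feature dimensions below are made up for illustration, not the actual CMU-MOSI ones):

```python
import numpy as np

# Early fusion (EFLSTM-style): concatenate the per-view features at each
# time step, then feed the fused sequence to one LSTM.
num_samples, time_steps = 32, 20
text = np.zeros((num_samples, time_steps, 300))   # e.g. word embeddings
audio = np.zeros((num_samples, time_steps, 74))   # hypothetical dim
video = np.zeros((num_samples, time_steps, 35))   # hypothetical dim

fused = np.concatenate([text, audio, video], axis=-1)
print(fused.shape)  # (32, 20, 409)
```

The LSTM then only ever sees one input stream of width 300 + 74 + 35, which is why EFLSTM needs no fusion machinery beyond the concatenation itself.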
@A2Zadeh OK. Got it. Thank you so much!
@justlovebarbecue You are welcome! Good luck.
Hi @A2Zadeh, I am trying to replicate your results on the YouTube dataset. I have some questions below:
Hi, thanks for sharing the code! Could you please provide more details about how to use our own data to run this model? Thanks!