about hyperparameter setting and detail

jedyang97 / MTAG

Code for NAACL 2021 paper: MTAG: Modal-Temporal Attention Graph for Unaligned Human Multimodal Language Sequences

MIT License

about hyperparameter setting and detail #2

Closed ztqwdk closed 3 years ago

ztqwdk commented 3 years ago

Hi Jed. I am interested in your work MTAG, and I think it is excellent. I downloaded the code, prepared the dataset, and ran 'bash run.sh' directly, but I could not get satisfactory results. Did you use the same hyperparameters as in run.sh to get the numbers reported in the paper? If not, could you please tell me the correct hyperparameters for the best performance? Are there any other details I should pay attention to during training?

Thanks.

jedyang97 commented 3 years ago

Hi @ztqwdk, thanks for your interest in our work! Note that you might be looking at the validation results when you directly run run.sh. To evaluate on the test set, please take a look at #1. Also please take a look at the appendix of the paper for the hyperparameters used for each dataset. Let me know if you need additional help!

ztqwdk commented 3 years ago

Thanks for your reply, @jedyang97. Yes, I saw #1 and have already evaluated on the test set. However, the hyperparameters in run.sh, the default values in main.py, and the hyperparameters in the appendix all differ from one another, which confuses me. If it is convenient, could you please tell me the exact value of every argument of main.py?
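
One way to pin down this kind of mismatch is to write the three sets of settings out side by side and list only the keys that disagree. The following is a minimal sketch, not part of the MTAG code: the helper `diff_settings`, the key names, and all values are hypothetical placeholders that would need to be replaced with the real contents of run.sh, main.py, and the appendix.

```python
MISSING = "<unset>"  # marker for a key that a source does not specify


def diff_settings(sources):
    """Return {key: {source_name: value}} for keys whose values differ across sources."""
    all_keys = set()
    for settings in sources.values():
        all_keys.update(settings)

    conflicts = {}
    for key in sorted(all_keys):
        values = {name: settings.get(key, MISSING) for name, settings in sources.items()}
        # Compare via repr so that unhashable values (e.g. lists) are handled too.
        if len({repr(v) for v in values.values()}) > 1:
            conflicts[key] = values
    return conflicts


# Placeholder values for illustration only; they are NOT MTAG's real settings.
sources = {
    "run.sh": {"batch_size": 32, "lr": 1e-3},
    "main.py defaults": {"batch_size": 16, "lr": 1e-3, "dropout": 0.1},
    "appendix": {"batch_size": 32, "lr": 1e-3},
}

conflicts = diff_settings(sources)
for key, values in conflicts.items():
    print(key, values)
```

Keys reported with `<unset>` for a source are the ones that source leaves to whatever default applies, which is exactly where an unnoticed difference can hide.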

jedyang97 commented 3 years ago

@ztqwdk I believe the ones in the appendix should be the ones to use.

ztqwdk commented 3 years ago

@jedyang97 The appendix gives the batch size, initial learning rate, optimizer, number of MTAG layers, number of attention heads, node embedding dimension, edge pruning keep percentage, and number of epochs. However, main.py takes other arguments as well. Could you share the complete 'run.sh' file, or the specific value of every argument, so that I do not miss any detail? Thanks for your reply.

jedyang97 commented 3 years ago

@ztqwdk I compiled a more comprehensive hyperparameter list (along with each setting's performance we obtained) in this Google Sheet. For any parameters that are not specified here, we used the default values in main.py.
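
The fallback rule described above (values listed in the sheet take precedence, everything else comes from the defaults in main.py) can be sketched with `argparse`. This is only an illustration under stated assumptions: the flag names, default values, and the `sheet_row` dictionary are hypothetical and are not taken from MTAG's main.py or from the Google Sheet.

```python
import argparse


def build_parser():
    """Hypothetical stand-in for the argument parser of a training script."""
    parser = argparse.ArgumentParser(description="hypothetical training script")
    parser.add_argument("--batch_size", type=int, default=32)
    parser.add_argument("--lr", type=float, default=1e-3)
    parser.add_argument("--num_layers", type=int, default=2)
    parser.add_argument("--dropout", type=float, default=0.1)
    return parser


def resolve_args(overrides):
    """Apply explicit overrides; every flag not listed keeps its parser default."""
    argv = []
    for name, value in overrides.items():
        argv += [f"--{name}", str(value)]
    return vars(build_parser().parse_args(argv))


# Placeholder row standing in for one line of a hyperparameter sheet.
sheet_row = {"batch_size": 8, "lr": 0.0005}
resolved = resolve_args(sheet_row)

# Printing the fully resolved configuration makes the effective defaults explicit.
for name, value in sorted(resolved.items()):
    print(f"{name} = {value}")
```

Logging the fully resolved configuration at the start of each run is a simple way to check that no argument silently fell back to an unintended default.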