Open cyang810 opened 1 year ago
Hi, I'm also looking for the exact arguments used to get the results in the paper; these should really be provided. I'm using the hyperparameter details given in the paper, but so far I can't reproduce the reported results.
The closest I've gotten is about 3% off in test accuracy (multi-class), which I think is substantial, and runs are all over the place depending on the seed, with some up to 8% off. This holds both for the default seed and for the other seeds I tried. It looks like seed hacking to me, or worse, unless the hyperparameters I'm using are off, which can't be assessed unless the exact arguments are given.
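For anyone else comparing runs, here is a minimal sketch of how the seed-to-seed spread could be reported so that results are comparable across people. `summarize_runs` is a hypothetical helper, not part of the repository, and the numbers in the usage line are placeholders rather than measured accuracies.

```python
import statistics


def summarize_runs(accuracies):
    """Summarize per-seed test accuracies: count, mean, sample std, min, max."""
    if len(accuracies) < 2:
        raise ValueError("need at least two runs to estimate the spread")
    return {
        "n": len(accuracies),
        "mean": statistics.mean(accuracies),
        "std": statistics.stdev(accuracies),  # sample standard deviation
        "min": min(accuracies),
        "max": max(accuracies),
    }


# Placeholder values for illustration only, NOT results from any real run.
summary = summarize_runs([1.0, 2.0, 3.0])
print(summary)
```

Reporting the mean and standard deviation over several seeds, rather than a single best run, would make it easier to tell whether a 3% gap is within normal variance.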
This issue has been raised and confirmed by many users now.
In other issues I've seen the authors simply say to refer to the paper, but the paper doesn't give enough information, and it's difficult to match its hyperparameters to the command-line arguments.
@yaohungt @jerrybai1995 @bryant1410
Can you share the hyperparameter configuration? With the settings described in the paper, I cannot reach the results reported there. Below are the arguments I set based on the paper. Is there a problem with them? (aligned MOSEI dataset)
--aligned --batch_size=16 --nlevels=4 --num_heads=8 --embed_dropout=0.3 --attn_dropout=0.1 --out_dropout=0.1 --clip=1.0 --num_epochs=20 --when=10 --attn_dropout_a=0.1 --attn_dropout_v=0.1
Looking forward to your reply; it would be a great encouragement for beginners.
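To make configurations like the one posted above easier to compare, a small sketch follows that turns a flag string into a dictionary and lists the keys on which two configurations differ. `parse_flags` and `diff_flags` are hypothetical helpers, not part of the repository, and the second configuration in the usage lines is invented purely to demonstrate the diff; it is not taken from the paper.

```python
import shlex


def parse_flags(command):
    """Parse '--name=value' and bare '--name' tokens into a dict of strings/True."""
    flags = {}
    for token in shlex.split(command):
        name, _, value = token.lstrip("-").partition("=")
        flags[name] = value if value else True  # bare flags become True
    return flags


def diff_flags(mine, reference):
    """Return {flag: (my_value, reference_value)} for every flag that differs."""
    keys = sorted(set(mine) | set(reference))
    return {k: (mine.get(k), reference.get(k))
            for k in keys if mine.get(k) != reference.get(k)}


# Flag string copied from the comment above.
posted = parse_flags(
    "--aligned --batch_size=16 --nlevels=4 --num_heads=8 --embed_dropout=0.3 "
    "--attn_dropout=0.1 --out_dropout=0.1 --clip=1.0 --num_epochs=20 --when=10 "
    "--attn_dropout_a=0.1 --attn_dropout_v=0.1"
)

# Hypothetical second configuration, made up only to show the diff output.
other = dict(posted, batch_size="24")
print(diff_flags(posted, other))  # {'batch_size': ('16', '24')}
```

If the authors posted their exact command, the same diff would show at a glance which arguments differ from a given attempt.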