Can you share the hyperparameter configuration?

yaohungt / Multimodal-Transformer

[ACL'19] [PyTorch] Multimodal Transformer

MIT License

827 stars, 152 forks

Hi, I'm also looking for the exact arguments used to get the results in the paper; these should really be provided. I'm using the hyperparameter details given in the paper, but so far I can't reproduce the reported results.

The closest I've gotten is about 3% off in test accuracy (multi-class), which I think is substantial, but runs are all over the shop depending on the seed, with some up to 8% off. This holds for both the default seed and the other seeds I tried. It seems like seed hacking to me, or worse, unless my hyperparameters are off, which can't be assessed until the exact arguments are given.
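One way to make the seed spread described above concrete is to report the mean and standard deviation over several seeds, next to the gap to the paper's number, rather than quoting a single run. A minimal sketch, assuming you have already collected one test accuracy per seed; the function name `summarize_runs` and every number below are hypothetical placeholders, not results from this repository or the paper:

```python
import statistics


def summarize_runs(acc_by_seed, reported_acc):
    """Summarize test accuracy across seeds and compare it with a reported value.

    acc_by_seed: dict mapping seed -> test accuracy as a fraction (hypothetical input).
    reported_acc: the accuracy claimed in the paper, as a fraction.
    """
    accs = list(acc_by_seed.values())
    mean = statistics.mean(accs)
    # The sample standard deviation needs at least two runs.
    std = statistics.stdev(accs) if len(accs) > 1 else 0.0
    return {
        "mean": mean,
        "std": std,
        "min": min(accs),
        "max": max(accs),
        "gap_to_reported": reported_acc - mean,
    }


# Placeholder accuracies for illustration only; substitute your own runs.
example = summarize_runs({1: 0.40, 2: 0.44, 3: 0.42}, reported_acc=0.50)
print(example)
```

Posting a summary like this alongside the exact command line used makes it easier to tell a hyperparameter mismatch apart from ordinary seed variance.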

This issue has been raised and confirmed by many users now.

I've seen in other issues that the authors have simply said to refer to the paper, but the paper doesn't give enough information, and it's difficult to match its hyperparameters to the script's arguments.

@yaohungt @jerrybai1995 @bryant1410

Can you share the hyperparameter configuration? #51