-
I ran the code on several of the datasets used in the paper, but I could not achieve the precision reported there. Could you publish the optimal hyperparameters for each experiment?
-
Summary: we need to provide a tool that lets the user run multiple training workers with different hyperparameter values.
Subtasks:
1. Develop the syntax for describing a strate…
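One possible shape for such a sweep, purely as an illustrative sketch (the spec format and the `expand_grid` helper are hypothetical, not part of any existing tool): a mapping from hyperparameter names to candidate values is expanded into one configuration per worker.

```python
import itertools

def expand_grid(spec):
    """Expand a {name: [values]} sweep spec into a list of per-worker configs."""
    keys = sorted(spec)
    return [dict(zip(keys, combo))
            for combo in itertools.product(*(spec[k] for k in keys))]

# Hypothetical sweep description: each key maps to the values to try.
sweep = {
    "lr": [1e-4, 3e-4],
    "batch_size": [32, 64],
}

configs = expand_grid(sweep)  # 4 configs; each would go to one training worker
```

A real strategy syntax would also need to express non-grid strategies (random search, early stopping), but a cross-product like this covers the simplest case.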
-
Hi @ku21fan
I just want to know which hyperparameters (number of iterations, lr, batch size, optimizer) were used to train the TPS-ResNet-BiLSTM-Attn-case-sensitive model. Also, what was the final training los…
-
![image](https://user-images.githubusercontent.com/113030234/195081171-e3ef992a-0b27-42b0-90cf-a688db0b1e3a.png)
I would like to ask the authors how the lambda and beta shown here were chosen; the original paper only mentions lambda.
-
After training with the Breakout configuration file, the score does not reach 800+ as stated in the paper. Can anyone share results for the Atari games and a detailed configuration file? Thank yo…
-
Does it require any adjustment? Should we change any of the hyperparameters, etc.?
-
Hi,
Thanks for providing your code. My question is regarding the results in Appendix D.
What hyperparameter configuration is used for these tasks?
Thanks.
-
Hi,
Could you please provide the hyperparameters for reproducing the results? I have run your train.sh, but I cannot reproduce the unconditional generation results. Besides, different s…
-
Thanks for sharing your code!
Could you share the hyperparameters for all models so that I can reproduce the results in the paper?
-
Hi @JunjieHu!
I am trying to reproduce results for XLM-R. The paper suggests that lr=3e-5 and effective bs=16 should be used for XLM-R.
It would be very helpful if you could share some more deta…