I read in the paper that you set a weight-decay in the optimizer, but I didn't see that term in your initialization of optimizer in main.py here. I wonder if I have skipped something or you really didn't set the regularization in your code? Thanks.
For the regularization, yes we found the regularization does not affect the results a lot, so we removed this part. As the dropout is already a good way to avoid overfitting.
I read in the paper that you set a weight-decay in the optimizer, but I didn't see that term in your initialization of optimizer in main.py here. I wonder if I have skipped something or you really didn't set the regularization in your code? Thanks.