nyu-dl / dl4mt-tutorial

BSD 3-Clause "New" or "Revised" License

Latest adam with recommended parameters #57

Closed by bastings 8 years ago

bastings commented 8 years ago

So here it is :-) This Adam code follows the latest version of the Adam paper, which means the parameters beta1 and beta2 are inverted compared to the Adam version currently in this repository. The default values for beta1, beta2, and epsilon are the recommended ones, in line with other Adam implementations.

Another change: Adam's learning rate is no longer set separately but is taken from the learning rate in options.
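
For reference, here is a minimal sketch of the update as written in Algorithm 1 of the final Adam paper, in plain Python rather than the repository's Theano code. It is not the code in this PR: the function name `adam_step`, the flat-list representation of parameters, and the `options['lrate']` key are assumptions made for illustration. The defaults shown (beta1 = 0.9, beta2 = 0.999, epsilon = 1e-8) are the values suggested in the paper.

```python
import math


def adam_step(params, grads, m, v, t, lr=0.001,
              beta1=0.9, beta2=0.999, eps=1e-8):
    """Apply one Adam update to flat lists of floats.

    Hypothetical helper for illustration. Returns the updated
    (params, m, v, t) without modifying the inputs.
    """
    t += 1
    new_params, new_m, new_v = [], [], []
    for p, g, m_i, v_i in zip(params, grads, m, v):
        # Exponential moving averages of the gradient and its square.
        m_i = beta1 * m_i + (1.0 - beta1) * g
        v_i = beta2 * v_i + (1.0 - beta2) * g * g
        # Bias correction for the zero-initialised averages.
        m_hat = m_i / (1.0 - beta1 ** t)
        v_hat = v_i / (1.0 - beta2 ** t)
        new_params.append(p - lr * m_hat / (math.sqrt(v_hat) + eps))
        new_m.append(m_i)
        new_v.append(v_i)
    return new_params, new_m, new_v, t


# Assumed options dict; the learning rate is read from it
# instead of being fixed inside the optimizer.
options = {'lrate': 0.001}

# Minimise f(x) = x^2 starting from x = 1.0 (gradient is 2x).
params, m, v, t = [1.0], [0.0], [0.0], 0
for _ in range(5):
    grads = [2.0 * x for x in params]
    params, m, v, t = adam_step(params, grads, m, v, t,
                                lr=options['lrate'])
```

With the old convention the decay arguments would be passed as one minus these values, so settings carried over from the previous optimizer should be checked before reuse.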

orhanf commented 8 years ago

@bastings thank you for the fix, merging.