bjlkeng / sandbox

Play time!
MIT License

change parameter b in learning rate s.t. it reflects paper #9

Closed: davetornado closed this 1 year ago

davetornado commented 1 year ago

You forgot a small "-1" term when calculating the learning rates, which makes them too small. This is why you don't reproduce the results of the Welling paper and can only barely find the second mode.
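For readers following along, here is a minimal sketch of where a "-1" term can arise. It assumes the schedule is the polynomial decay eps_t = a * (b + t) ** (-gamma), with a and b chosen so the step size falls from a start value to an end value over the run. That schedule is not stated in this thread, and the function names, defaults, and example values below are hypothetical, not taken from the repository. Solving the two endpoint conditions for b gives b = T / ((eps_start / eps_end) ** (1 / gamma) - 1).

```python
def polynomial_schedule_params(eps_start, eps_end, num_steps, gamma=0.55):
    """Solve for a and b in eps_t = a * (b + t) ** (-gamma).

    Assumed endpoint conditions (hypothetical setup, not from the repo):
    eps_0 == eps_start and eps_{num_steps} == eps_end.
    """
    # From eps_start / eps_end = ((b + T) / b) ** gamma it follows that
    # 1 + T / b = (eps_start / eps_end) ** (1 / gamma).
    ratio = (eps_start / eps_end) ** (1.0 / gamma)
    # The "- 1" comes from moving the 1 on the left-hand side across.
    b = num_steps / (ratio - 1.0)
    # a follows from the condition at t = 0.
    a = eps_start * b ** gamma
    return a, b


def step_size(t, a, b, gamma=0.55):
    """Step size at iteration t under the assumed polynomial schedule."""
    return a * (b + t) ** (-gamma)


if __name__ == "__main__":
    # Example values chosen only for illustration.
    a, b = polynomial_schedule_params(0.01, 0.0001, num_steps=10000)
    print(step_size(0, a, b), step_size(10000, a, b))
```

Under this parameterization, dropping the "- 1" gives a smaller b, and with a fixed from the condition at t = 0 every later step size shrinks, which would be consistent with learning rates that are too small.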

bjlkeng commented 1 year ago

Thanks @davetornado, brilliant find! I fixed it directly in the main branch because the notebook was too hard to merge (serves me right for using notebooks). Your bug uncovered a couple of other bugs I had, so I fixed those too.

I'm able to find both modes now without too much trouble, although the balance between the two is a bit off. I'll update the blog post momentarily.

davetornado commented 1 year ago

Great! Thanks for creating the blog and the git repo to begin with.