Open miguelesteras opened 7 years ago
Ok, I'm pretty sure I'm done here. I think it's pretty thorough (maybe too thorough 🤓, but there's no word limit, so I'm not worried about this being a problem). You might want to proofread it, just to confirm you agree with my definition and formalisation of the parameters. Let me know when you've done this, and if you give it the ok I'll close the issue 💪
Almost done; I just need to add a brief description of the roles of the learning rate (alpha) and the discount factor (gamma).