numenta / nupic.embodied

GNU Affero General Public License v3.0
Code cleanup, refactor of the update function, and improvements to backprop through reward #27

Closed lucasosouza closed 3 years ago

lucasosouza commented 3 years ago

@vkakerbeck This PR includes the changes in the first one plus a few more commits. Major changes:

This PR will be easier to review once the previous one is merged, since it will then show only the diff for the recent commits.

If you prefer, you could also review this one directly, with the changes combined, and discard the previous one. I would merge them separately though, since I've made a lot more changes in this one, and we want to be able to roll back these changes in case they introduce a bug that impacts ongoing experiments.

Edit: I've made some more changes to reflect what was discussed on Slack. When backpropagating through rewards, the update now alternates between steps that update the dynamics model and steps that update the policy.
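
To illustrate the alternating schedule, here is a minimal toy sketch. It is not the code in this PR: the scalar "dynamics model" `w`, the scalar "policy" `theta`, the quadratic reward, and the function name `train_alternating` are all hypothetical stand-ins chosen so the gradients can be written by hand.

```python
def train_alternating(num_steps=200, lr=0.05, target=3.0, true_gain=2.0):
    """Toy example: alternate dynamics-model and policy updates.

    Hypothetical setup (not from the repository):
      true dynamics:    next_state = true_gain * action
      learned dynamics: predicted  = w * action
      policy:           action     = theta (a single constant action)
      reward:           -(predicted - target) ** 2
    """
    w = 0.5      # dynamics model parameter
    theta = 1.0  # policy parameter

    for step in range(num_steps):
        action = theta
        if step % 2 == 0:
            # Dynamics step: fit the model to the observed transition,
            # leaving the policy untouched.
            next_state = true_gain * action
            error = w * action - next_state
            w -= lr * 2.0 * error * action  # gradient of error ** 2 w.r.t. w
        else:
            # Policy step: backpropagate the reward through the learned
            # dynamics model, leaving the model untouched.
            predicted = w * action
            grad_reward = -2.0 * (predicted - target) * w  # d(reward)/d(theta)
            theta += lr * grad_reward  # gradient ascent on the reward

    return w, theta


if __name__ == "__main__":
    w, theta = train_alternating()
    print(f"learned gain w={w:.3f}, policy action theta={theta:.3f}")
```

In this toy setup the model parameter should approach the true gain while the policy settles on the action whose predicted next state matches the target. Each step updates only one of the two, so the policy gradient is always computed against a dynamics model that is held fixed for that step.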