IntelLabs / coach

Reinforcement Learning Coach by Intel AI Lab enables easy experimentation with state of the art Reinforcement Learning algorithms
https://intellabs.github.io/coach/
Apache License 2.0
2.32k stars 459 forks source link

Ensure is_training is a tf variable, not a bool: fixes dropout/batchnorm #270

Closed jamescasbon closed 5 years ago

jamescasbon commented 5 years ago

The is_training parameter is currently a boolean which is fixed (to false) during the run. This PR changes it to be the general_network.is_training tf.Variable which is updated during the run.

I think this is a bug.

gal-leibovich commented 5 years ago

Thanks a lot @jamescasbon, nice catch!!

This is still failing some tests, though, and also requires some fixes also for heads (e.g. currently being used in DDPGActorHead and RegressionHead, but probably a more generic solution is preferable).

jamescasbon commented 5 years ago

:/ its quite difficult for me to run the tests as I can't run a docker image due to organisational insanity that makes fixing this quite hard

jamescasbon commented 5 years ago

I might have to leave it with you @galleibo-intel

I think the problem is clear?

gal-leibovich commented 5 years ago

Building and running a docker is not a necessity. You can run the tests locally without it, by following the instructions here - https://github.com/NervanaSystems/coach/tree/master/rl_coach/tests (ignoring the docker building). If this is something that you don't plan to carry on, that's ok, just let us know, and we'll imlement the fix. It will not be a part of this PR though (which will be closed).

jamescasbon commented 5 years ago

OK, I've updated most unit tests. I still have a few fail due to memory allocation errors but I don't see any mention of how much is needed on the tests doc.

gal-leibovich commented 5 years ago

This PR had some more minor fixes required in order for it to work correctly. Since the original branch is on a fork, we couldn't update it, so instead opened a new branch in this repo and a new PR #353 tracking it. We'll try to make sure that the right credit will be given to @jamescasbon!