gal-leibovich closed 5 years ago
This code change seems OK, but it looks like it may have uncovered a couple of issues lurking in CartPole_DQN_BatchRL and Pendulum_HAC (see integration test output). Any idea what's going on there, @galleibo-intel?
Yeah, seems like it did. This PR was originally motivated by issues with the Rainbow DQN agent, and I didn't understand why the integration tests hadn't caught them.
Now, there are two interesting things here:
Regarding 1, looks OK to me. Regarding 2, I can investigate this issue; very interesting.
While this issue is being debugged, could we approve this PR?
Yeah this seems to be the same kind of issue as in the BatchRL branch. Unfortunately, I don't know how to fish out the specific job where it was passing.
I have pushed a fix for the issue that was just revealed, but more issues have now appeared, this time with the build process (mujoco related?).
Yes, as discussed over on #264, I pushed updated Dockerfiles to fix the Mujoco build issue, and then kicked off a CI re-run on your latest commit. Looks like we're back in business for building, but there's still more to fix on the unit/integration side? Bugfix whack-a-mole continues :)
Looks like gym modified some of its interfaces yesterday, and we're now getting hit by: https://github.com/openai/gym/commit/f5d571a16d18d3a64d56819c76e803a89149397c
Yeah, seems like that might be it. I have pushed (yet another) fix.
It is still failing on golden_test_gym; I'm guessing the Dockerfile is pinned to some older gym version.
That's right, @galleibo-intel: in Dockerfile.gym_environment we were pinning to the 0.10.5 gym release, but I've just relaxed this given your most recent changes on this branch. I've manually pushed the updated Dockerfile and kicked off another run on this branch.
CartPole_ClippedPPO still shows as failing due to bad address. This seems to be some CI hiccup, as the test passes for me locally. Merging this in.
integration test changes to override heatup to 1000 steps + run each preset for 30 sec (to make sure we reach the train part)
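For readers unfamiliar with the "run each preset for a fixed time" idea in the commit message above, here is a minimal sketch of how such a check might look. It is not the project's actual test harness: the helper name `run_with_time_budget` is hypothetical, the command to launch a preset is left to the caller, and the heatup override is not shown because the thread does not say how it is applied. The assumption is that a preset which is still running when the budget expires has not crashed, so the check passes.

```python
import subprocess
import sys


def run_with_time_budget(cmd, budget_sec):
    """Run `cmd` for at most `budget_sec` seconds (hypothetical helper).

    Returns True if the process is still alive when the budget expires
    (assumed to mean "no crash so far") or if it exits on its own with
    code 0. Returns False if it exits early with a non-zero code.
    """
    proc = subprocess.Popen(
        cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    try:
        return_code = proc.wait(timeout=budget_sec)
    except subprocess.TimeoutExpired:
        # Budget used up without a crash: stop the process and report success.
        proc.kill()
        proc.wait()
        return True
    return return_code == 0


if __name__ == "__main__":
    # Stand-in commands; a real test would launch a preset instead.
    long_running = [sys.executable, "-c", "import time; time.sleep(30)"]
    crashing = [sys.executable, "-c", "import sys; sys.exit(3)"]
    print(run_with_time_budget(long_running, budget_sec=1))
    print(run_with_time_budget(crashing, budget_sec=10))
```

A short heatup matters for this kind of check: if heatup consumed the whole time budget, the process would pass without ever reaching the training code, which is presumably why the commit lowers heatup to 1000 steps.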