Identical values for state and next state while iterating episode

baggepinnen commented 5 years ago

This commit 28d5953 introduced a bug. While iterating an episode, identifcal values are obtained for s and s1, e.g.

for (s,a,r,s1) in ep 
    ...
end

debug> s
2-element PyArray{Float64,1}:
 -0.4688434880873431   
  0.0003026530939645928

debug> s1
2-element PyArray{Float64,1}:
 -0.4688434880873431   
  0.0003026530939645928

JobJob commented 5 years ago

Yep, I've run into this issue in a different form in my own code. This is one of the reasons I hadn't merged #12, another important one being I haven't had time to work on/check it with latest PyCall.

I think I'll get a chance to clean this up in the next few days. The likely fix will be to rollback #12 - which could be done now as a stopgap if you wanted @iblis17 ? Sorry I wasn't around to comment there in the last few days. From memory, when I last checked, the speed increase was smaller than it was when I originally made the PR, and I think returning a new array matches the Python API better, and will be less unexpected for users.

If it does turn out to be significantly slower, then another path is to modify the Episode iterator to make a copy for Gym environments, but I'm not certain PyArray will be deepcopy-able without some modifications.

Anyway, it's important to me for things to be fast, so will try make sure I can keep them that way. Will hopefully have a look at #13 again too when I work on this.

iblislin commented 5 years ago

I think I'll get a chance to clean this up in the next few days. The likely fix will be to rollback #12 - which could be done now as a stopgap if you wanted?

Rollbacking is cheap in git. just go for it.

iblislin commented 5 years ago

something like git revert HEAD

JuliaML / OpenAIGym.jl

Identical values for state and next state while iterating episode #32