JuliaML / OpenAIGym.jl

OpenAI's Gym binding for Julia

Avoid shuffling underlying memory in observation #4

Open dmrd opened 8 years ago

dmrd commented 8 years ago

In the Go environment, the observation is passed as a (3,9,9) array from the Python side, and the underlying memory seems to be shuffled on the Julia side. We probably want this to be a (9,9,3) array on the Julia side, so that the memory layout stays the same.

We need some way to reverse the dimensions instead of rearranging the memory every time (e.g. the revdims option in PyCall master). Unfortunately, that option seems to apply only when passing arrays from Julia to NumPy.

Compare the flattened observations (notice that Python gives 0000...1111, while Julia gives 001001001...):

Python

In [71]: env = gym.make('Go9x9-v0')
[2016-05-06 17:47:38,641] Making new env: Go9x9-v0
Initializing Pachi engine uct with args threads=1,pondering=0

In [72]: env.reset().flatten()
Initializing Pachi engine uct with args threads=1,pondering=0
Out[72]:
array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
       1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
       1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
       1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1])

Julia

julia> env = Env("Go9x9-v0")
[2016-05-06 17:48:15,540] Making new env: Go9x9-v0
Initializing Pachi engine uct with args threads=1,pondering=0
To play: black
Move:   0  Komi: 0.0  Handicap: 0  Captures B: 0 W: 0
      A B C D E F G H J
    +-------------------+
  9 | . . . . . . . . . |
  8 | . . . . . . . . . |
  7 | . . . . . . . . . |
  6 | . . . . . . . . . |
  5 | . . . . . . . . . |
  4 | . . . . . . . . . |
  3 | . . . . . . . . . |
  2 | . . . . . . . . . |
  1 | . . . . . . . . . |
    +-------------------+

julia> reset(env).observation[:]
Initializing Pachi engine uct with args threads=1,pondering=0
243-element Array{Int64,1}:
 0
 0
 1
 0
 0
 1
 0
 0
 1
...
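
The difference between the two outputs can be reproduced in plain Julia, without Gym or PyCall. This is only a sketch: it assumes the conversion keeps each element at the same index (which the outputs above suggest) and uses a stand-in vector in place of the real observation. The names `c_order`, `shared`, and `copied` are illustrative.

```julia
# The 243 values in NumPy's row-major (C) order: two empty planes, then a plane of ones.
c_order = vcat(zeros(Int, 162), ones(Int, 81))

# Reading the same memory in Julia's column-major order requires reversed
# dimensions: NumPy's (3, 9, 9) becomes (9, 9, 3). reshape shares the data.
shared = reshape(c_order, 9, 9, 3)

# An index-preserving conversion keeps the (3, 9, 9) shape, so the elements
# have to be moved. permutedims allocates a new array to do that.
copied = permutedims(shared, (3, 2, 1))

size(copied)            # (3, 9, 9)
vec(copied)[1:9]        # [0, 0, 1, 0, 0, 1, 0, 0, 1]
vec(shared) == c_order  # true: the (9, 9, 3) view leaves the memory order untouched
```

Here `copied[p, r, c]` equals `shared[c, r, p]`, so the (9,9,3) array holds the same data as the (3,9,9) one with the index order reversed.
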
tbreloff commented 8 years ago

Can I ask what you would like to do with this data? If you'd just like a convenient way to index it the way you'd expect, it's super easy to wrap it yourself with an immutable and define your own getindex. Something more complex may need a copy. Let me know what you want to do with the state and I can help better.
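
A minimal sketch of such a wrapper, written for current Julia, where the `immutable` keyword has become `struct`. The type name `ReversedDims` and the stand-in array `obs` are hypothetical; neither comes from OpenAIGym.jl or PyCall.

```julia
# Hypothetical no-copy wrapper: presents an array with its dimensions reversed.
struct ReversedDims{T,N,A<:AbstractArray{T,N}} <: AbstractArray{T,N}
    parent::A
    # Inner constructor so that T and N are taken from the wrapped array.
    ReversedDims(parent::AbstractArray{T,N}) where {T,N} =
        new{T,N,typeof(parent)}(parent)
end

Base.size(r::ReversedDims) = reverse(size(r.parent))

# Map wrapper indices (i, j, k) to parent indices (k, j, i); nothing is copied.
Base.getindex(r::ReversedDims{T,N}, I::Vararg{Int,N}) where {T,N} =
    r.parent[reverse(I)...]

# Stand-in for a (3, 9, 9) observation as it arrives on the Julia side.
obs = reshape(collect(1:243), 3, 9, 9)
flipped = ReversedDims(obs)

size(flipped)                      # (9, 9, 3)
flipped[4, 5, 3] == obs[3, 5, 4]   # true
```

The wrapper only changes how indices are interpreted. The parent's memory order stays as it is, so walking `flipped` in Julia's usual column-major order becomes strided access on `obs`. Base's `PermutedDimsArray(obs, (3, 2, 1))` offers the same kind of lazy view and may be preferable to a hand-written type.
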

dmrd commented 8 years ago

It's easy enough to reshuffle again on the Julia side. It's more about avoiding overhead in the first place than convenience.

Go has a small state space, so the overhead isn't huge, but for environments where the state is much larger (e.g. an image), the overhead could add up.
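
If the data does have to be reshuffled on the Julia side, one way to limit the per-step cost is to permute into a buffer that is allocated once. The sketch below uses a hypothetical image-sized observation; the shape and the names `obs` and `buf` are assumptions for illustration.

```julia
# Hypothetical channels-first image observation (3 colour planes of 210 x 160 pixels).
obs = rand(UInt8, 3, 210, 160)

# Allocate the destination once, outside the environment loop.
buf = Array{UInt8}(undef, 160, 210, 3)

# Each step reuses buf: the elements are still moved, but no new array is allocated.
permutedims!(buf, obs, (3, 2, 1))

buf[160, 210, 3] == obs[3, 210, 160]  # true
```

This removes the repeated allocation but not the element-by-element move, so it does not replace a true no-copy conversion.
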

tbreloff commented 8 years ago

What I'm saying is that if it's worthwhile to avoid a copy, you can create a no-copy version which just maps Julia coordinates to Python coordinates. I just wanted to understand what you're doing with the Array to make a judgement call on whether that is better or worse than just copying. Probably better!

dmrd commented 8 years ago

I'll take a look at wrapping it at a lower level without a copy.

This isn't an immediate performance problem, but it seems like there must be a clean way in PyCall to do no-copy by default (I think that makes more sense Julia API-wise). The revdims option I linked is relatively recent. I'm not sure if there's something similar for going from Python to Julia.

I ran into a related problem when wrapping Keras, where passing an array from Julia to Python with the proper dimensions would shuffle the underlying memory. It only mattered because the reshuffling used a huge amount of memory, which is not a problem with the gym right now.