google-deepmind / bsuite

bsuite is a collection of carefully-designed experiments that investigate core capabilities of a reinforcement learning (RL) agent
Apache License 2.0

bsuite_tutorial.ipynb error - load bsuite environments as OpenAI gym- #11

Closed SoyGema closed 5 years ago

SoyGema commented 5 years ago

Hey there! Thanks for open-sourcing this tool for better understanding the behavior of RL agents :)

There seems to be an error in the Colab tutorial when executing the "Simple to load bsuite environments as OpenAI gym" cell:

#@title Simple to load bsuite environments as OpenAI gym

from bsuite.utils import gym_wrapper
raw_env = bsuite.load_from_id(bsuite_id='memory_len/0')
env = gym_wrapper.GymWrapper(raw_env)
isinstance(env, gym.Env)

Should

env = gym_wrapper.GymWrapper(raw_env)

instead be

env = gym_wrapper.GymFromDMEnv(raw_env)

as the documentation indicates?

My apologies for not submitting a PR here; I was not able to access the Colab doc.

Have a nice day!

yotam commented 5 years ago

You are right. We renamed the class soon after launch and did not update this colab. We'll fix it as you suggest. Thanks for spotting this!

iosband commented 5 years ago

Fixed - thanks!

SoyGema commented 5 years ago

Awesome! I just realized the class is also used at the end of the tutorial, in the "Build your own agents however you like!" cell, in case you want to change that as well.

from baselines.common.vec_env import dummy_vec_env
from baselines.ppo2 import ppo2
from bsuite.utils import gym_wrapper
import tensorflow as tf

SAVE_PATH_PPO = '/tmp/bsuite/ppo'

def _load_env():
  raw_env = bsuite.load_and_record(
      bsuite_id='bandit_noise/0',
      save_path=SAVE_PATH_PPO, logging_mode='csv', overwrite=True)
  return gym_wrapper.GymWrapper(raw_env)
env = dummy_vec_env.DummyVecEnv([_load_env])

ppo2.learn(
    env=env, network='mlp', lr=1e-3, gamma=.99,
    total_timesteps=10000, nsteps=100)

Thanks again for fixing it ^^

yotam commented 5 years ago

Thanks again, I have just updated that part as well.
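Since the old name turned up in two separate cells, a quick scan of the notebook for leftover references could catch any others. Below is a minimal sketch, assuming the notebook is standard nbformat v4 JSON (a top-level `cells` list whose entries carry `cell_type` and `source`). `find_stale_references` is a hypothetical helper, and the tiny in-memory notebook is made up for illustration.

```python
import json


def find_stale_references(notebook, old_name="GymWrapper"):
    """Return (cell_index, line_number, line) for each code line containing old_name."""
    hits = []
    for cell_index, cell in enumerate(notebook.get("cells", [])):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", "")
        # nbformat stores source either as one string or as a list of lines.
        if isinstance(source, list):
            source = "".join(source)
        for line_number, line in enumerate(source.splitlines(), start=1):
            if old_name in line:
                hits.append((cell_index, line_number, line.strip()))
    return hits


# A made-up two-cell notebook, round-tripped through JSON the way a file would be.
demo_notebook = json.loads(json.dumps({
    "cells": [
        {"cell_type": "markdown", "source": ["GymWrapper is mentioned in prose only\n"]},
        {"cell_type": "code", "source": [
            "from bsuite.utils import gym_wrapper\n",
            "env = gym_wrapper.GymWrapper(raw_env)\n",
        ]},
    ]
}))

stale = find_stale_references(demo_notebook)
```

For a real file, the notebook dict would come from `json.load` on the opened `.ipynb` file; an empty result would suggest no code cell still uses the old name.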