Closed: hermanjakobsen closed this issue 3 years ago
Intuitively, the problem with spamming glfw windows was solved by setting has_renderer = True
when creating the environment.
Hi again, guys!
I am having a hard time integrating the environment with the baseline algorithms. Do you have any experience with this? And if so, would you care to share a pipeline on how to train and run the environment with the algorithms?
For future reference, if someone needs a training and testing pipeline for a custom-made environment:
```python
import robosuite as suite
import gym
import numpy as np

from robosuite.environments.base import register_env
from robosuite import load_controller_config
from robosuite.wrappers import GymWrapper

from stable_baselines3 import PPO
from stable_baselines3.common.save_util import save_to_zip_file, load_from_zip_file
from stable_baselines3.common.monitor import Monitor
from stable_baselines3.common.vec_env import DummyVecEnv, VecNormalize

from my_environments import FetchPush

register_env(FetchPush)

# Assumption: the controller config was not shown in the original post;
# a default OSC_POSE config is used here as a placeholder
controller_config = load_controller_config(default_controller='OSC_POSE')

# Training
env = GymWrapper(
    suite.make(
        'FetchPush',
        robots='UR5e',
        controller_configs=controller_config,
        gripper_types=None,
        has_renderer=False,
        has_offscreen_renderer=False,
        use_camera_obs=False,
        use_object_obs=True,
        control_freq=50,
        render_camera=None,
        horizon=2000,
        reward_shaping=True,
    )
)
env = wrap_env(env)

filename = 'test'

model = PPO('MlpPolicy', env, verbose=1, tensorboard_log='./ppo_fetchpush_tensorboard/')
model.learn(total_timesteps=int(3e5), tb_log_name=filename)

model.save('trained_models/' + filename)
env.save('trained_models/vec_normalize_' + filename + '.pkl')  # Save VecNormalize statistics

# Testing
'''
Create an identical environment with a renderer, or override the render function
in the environment to something like this (see the sketch after this script):

def render(self, mode=None):
    super().render()
'''
env_robo = GymWrapper(
    suite.make(
        'FetchPush',
        robots='UR5e',
        controller_configs=controller_config,
        gripper_types=None,
        has_renderer=True,
        has_offscreen_renderer=False,
        use_camera_obs=False,
        use_object_obs=True,
        control_freq=50,
        render_camera=None,
        horizon=2000,
        reward_shaping=True,
    )
)

# Load model
model = PPO.load('trained_models/' + filename)

# Load the saved statistics
env = DummyVecEnv([lambda: env_robo])
env = VecNormalize.load('trained_models/vec_normalize_' + filename + '.pkl', env)
# Do not update the normalization statistics at test time
env.training = False
# Do not normalize rewards at test time
env.norm_reward = False

obs = env.reset()
while True:
    action, _states = model.predict(obs, deterministic=True)
    obs, reward, done, info = env.step(action)
    env_robo.render()
    if done:
        obs = env.reset()

env.close()
```
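As a concrete example of the render override mentioned in the comment inside the script, here is a minimal sketch (the class name RenderableGymWrapper is illustrative, not from the original post):

```python
from robosuite.wrappers import GymWrapper

class RenderableGymWrapper(GymWrapper):
    def render(self, mode=None):
        # Accept the gym-style `mode` argument that stable-baselines3 may pass,
        # then delegate to robosuite's on-screen renderer (needs has_renderer=True)
        self.env.render()
```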
Hi @hermanjakobsen ,
Apologies for not getting back to this sooner, and glad you were able to get this solved on your own -- would you like to join our slack robosuite workspace? As a deep user, you'd be a great addition to the community and also be able to get a lot quicker feedback for any issues that arise (I check the workspace more frequently than our main issues page).
If you're interested, please check out our contribution guidelines for more info and the slack link.
Thanks for the sample code for integrating robosuite and OpenAI baselines. I have a few questions regarding the implementation:
1) In env = wrap_env(env), where do you define wrap_env()?
2) Did you set has_renderer = True for the training env too to avoid spawning multiple glfw windows?

Hi @Abhimanyu8713,
An updated script which utilizes multiprocessing for training is available here. (A rough sketch of the multiprocessing setup is also included after the answers below.)
To answer your questions:
1) wrap_env(env) was just a simple function for wrapping the environment in the necessary wrappers. I think it was implemented as

```python
def wrap_env(env):
    wrapped_env = Monitor(env)  # Needed for extracting eprewmean and eplenmean
    wrapped_env = DummyVecEnv([lambda: wrapped_env])  # Needed for all environments (e.g. used for multi-processing)
    wrapped_env = VecNormalize(wrapped_env)  # Needed for improving training when using MuJoCo envs?
    return wrapped_env
```
Sorry for not including this in the code above :)
2) I set has_renderer = False when training to avoid spawning glfw windows, as you proposed. I then turned on rendering when testing the policy.
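Not the exact script linked above, but a minimal sketch of what the multiprocessed training setup can look like with SubprocVecEnv (the worker count n_envs, the make_env factory, and the trimmed-down set of FetchPush kwargs are assumptions, not verbatim from the linked script):

```python
import robosuite as suite
from robosuite import load_controller_config
from robosuite.environments.base import register_env
from robosuite.wrappers import GymWrapper
from stable_baselines3 import PPO
from stable_baselines3.common.monitor import Monitor
from stable_baselines3.common.vec_env import SubprocVecEnv, VecNormalize

from my_environments import FetchPush

register_env(FetchPush)

def make_env():
    # Factory: each subprocess builds (and owns) its own environment instance
    def _init():
        controller_config = load_controller_config(default_controller='OSC_POSE')
        env = GymWrapper(
            suite.make(
                'FetchPush',
                robots='UR5e',
                controller_configs=controller_config,
                has_renderer=False,  # no rendering in worker processes
                has_offscreen_renderer=False,
                use_camera_obs=False,
                use_object_obs=True,
                reward_shaping=True,
            )
        )
        return Monitor(env)
    return _init

if __name__ == '__main__':
    n_envs = 4  # assumption: number of parallel workers
    env = VecNormalize(SubprocVecEnv([make_env() for _ in range(n_envs)]))
    model = PPO('MlpPolicy', env, verbose=1)
    model.learn(total_timesteps=int(3e5))
```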
Thanks for the quick response as well as the script. This is exactly what I was looking for.
Hey guys!
I would like to test out RL algorithms on my robosuite environment, and the baselines repo from OpenAI offers a collection of RL algorithms.
Do you have any experience with integrating robosuite environments with the RL algorithms in the baselines repo? Phrased differently, do you have any tips on how to run the baselines algorithms with my own custom robosuite environment?
I have tried the following with my custom FetchPush environment. I also had to add a call method for the GymWrapper, as shown in the sketch below.
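The call method was roughly along these lines (a sketch; the exact snippet is not shown here, so the class name and body are assumptions):

```python
from robosuite.wrappers import GymWrapper

class CallableGymWrapper(GymWrapper):
    def __call__(self):
        # DummyVecEnv expects callables that each return an environment, so
        # making the wrapper callable lets the instance be passed in directly
        return self
```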
When I run the code, the RL algorithm initially produces output as expected. Then, the script just starts to repeatedly create and close glfw windows. Any help, thoughts or experiences regarding this topic would be greatly appreciated! :)