Trained Rollout Scripts exhibit random behavior

rojas70 commented 3 years ago

As part of the Visualization of Rollout Scripts, I ran the rollout.py script with different log files such as:

python scripts/rollout.py --load_dir runs/Door-Panda-OSC-POSE-SEED129/Door_Panda_OSC_POSE_SEED129_2020_09_21_20_07_20_0000--s-0/ --horizon 200 --camera frontview

However, for this and all other files that I tested, the robot agent seems to only exhibit random behavior.

From the documentation, I thought these would be well-trained agents. The documentation states:

We provide a rollout script for executing and visualizing rollouts using a trained agent model.

Am I making a mistake or missing something here? Thank you

sguysc commented 2 years ago

@rojas70, it's been a while :) but did you figure this out? I'm having the same issue and am trying to work out what's wrong. Here is what I run (I didn't actually install robosuite-benchmark or rlkit):

import torch
import numpy as np
import robosuite
from robosuite.controllers import load_controller_config
from robosuite.wrappers import GymWrapper

has_renderer = True
N_episodes = 1

controller_config = load_controller_config(default_controller="OSC_POSE")

# create environment instance
env_s = robosuite.make(
    env_name="Lift",
    robots="Panda",
    gripper_types="default",
    controller_configs=controller_config,   # arm is controlled using OSC
    has_renderer=has_renderer,
    has_offscreen_renderer=False,           # no off-screen rendering
    control_freq=20,                        # 20 Hz control for applied actions
    horizon=500,
    use_object_obs=True,
    use_camera_obs=False,
    reward_shaping=True,
)

# Make sure we only pass in the proprio and object obs (no images)
keys = ["object-state"]
for idx in range(len(env_s.robots)):
    keys.append(f"robot{idx}_proprio-state")

# Wrap environment so it's compatible with Gym API
env = GymWrapper(env_s, keys=keys)

data = torch.load('Lift_Panda_OSC_POSE_SEED17_2020_09_13_00_26_56_0000--s-0/params.pkl', map_location=torch.device("cpu"))

policy = data['evaluation/policy']
policy.reset()

for i_episode in range(N_episodes):
    obs = env.reset()
    ret = 0.
    done = False
    i = 0
    while not done:
        action, agent_info = policy.get_action(obs)  # use observation to decide on an action
        obs, reward, done, info = env.step(action)   # take action in the environment
        ret += reward
        i += 1
        if has_renderer:
            env.render()  # render on display

    print("rollout completed with return {}".format(ret))

env.close()
env_s.close()

ivanangjx commented 1 year ago

Hi @rojas70 @sguysc, were you able to figure out the issue? Thanks.

sguysc commented 1 year ago

@ivanangjx, I don't think I have. I ended up training my own controller with RL. Also, don't forget that you need to set a specific seed so that the simulation itself is deterministic. That might be the problem.

np.random.seed(seed)
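
The line above seeds only NumPy. A slightly broader sketch, assuming the rollout also draws random numbers from Python's `random` module and from PyTorch (which the loading code earlier in this thread imports), could look like the following. `seed_everything` is a hypothetical helper name, not part of robosuite or rlkit, and nothing in this thread confirms that seeding alone resolves the behavior being reported.

```python
import random


def seed_everything(seed):
    """Seed the random number generators a rollout is likely to touch.

    Hypothetical helper for illustration; not part of robosuite or rlkit.
    NumPy and PyTorch are seeded only if they are installed.
    """
    random.seed(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass  # NumPy not installed; nothing to seed
    try:
        import torch
        torch.manual_seed(seed)
    except ImportError:
        pass  # PyTorch not installed; nothing to seed


# Seeding twice with the same value reproduces the same draw.
seed_everything(17)
first_draw = random.random()
seed_everything(17)
second_draw = random.random()
print(first_draw == second_draw)
```

Calling the helper once, before the environment and the policy are created, keeps every later random draw tied to the same seed.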

junhui1997 commented 8 months ago

Hi @sguysc, I followed your code and manually set the seed according to the folder name, but the agent still exhibits random behavior. Do you have any idea why? Thanks.
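
Reading the seed from the folder name by hand is easy to get wrong, so here is a minimal sketch that extracts it with a regular expression. It assumes the run directories follow the `..._SEED<number>_...` pattern seen in the paths quoted in this thread; `seed_from_run_dir` is a hypothetical helper, not something provided by robosuite-benchmark.

```python
import os
import re


def seed_from_run_dir(path):
    """Return the integer after 'SEED' in the last path component, or None.

    Hypothetical helper; assumes names like
    'Lift_Panda_OSC_POSE_SEED17_2020_09_13_00_26_56_0000--s-0'.
    """
    # normpath drops a trailing slash so basename returns the folder name
    name = os.path.basename(os.path.normpath(path))
    match = re.search(r"SEED(\d+)", name)
    return int(match.group(1)) if match else None


print(seed_from_run_dir("Lift_Panda_OSC_POSE_SEED17_2020_09_13_00_26_56_0000--s-0"))  # prints 17
```

The returned value can then be passed to whatever seeding call the rollout uses, such as `np.random.seed`.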

ARISE-Initiative / robosuite-benchmark