Closed · gabriansa closed this issue 2 years ago
This is not currently supported in Assistive Gym. It would require a few changes. First, the active-human environments use an RLlib interface rather than a strict Gym interface; see: https://github.com/Healthcare-Robotics/assistive-gym/blob/main/assistive_gym/envs/feeding_envs.py#L44
Then, for active-human environments, two policies are actually trained: https://github.com/Healthcare-Robotics/assistive-gym/blob/main/assistive_gym/learn.py#L34 You will want to pull out and use only the policy for the robot, make sure to set coop=False when loading the policy (https://github.com/Healthcare-Robotics/assistive-gym/blob/main/assistive_gym/learn.py#L32), and ensure that self.human.controllable is False: https://github.com/Healthcare-Robotics/assistive-gym/blob/main/assistive_gym/envs/feeding.py#L13
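The control flow those steps imply could be sketched roughly as follows. Note this is a hedged illustration, not Assistive Gym's or RLlib's actual API: the class names, observation/action dimensions, and `compute_action` stub below are all placeholders standing in for the real robot policy (which RLlib would expose via something like a `policy_id='robot'` lookup) and the real static-human environment.

```python
class StubRobotPolicy:
    """Stand-in for the 'robot' policy pulled out of a coop-trained checkpoint."""
    def compute_action(self, obs):
        # A real policy maps the robot observation to a robot action.
        return [0.0] * 7  # placeholder 7-DoF action

class StubStaticHumanEnv:
    """Stand-in for e.g. FeedingJaco-v1, where self.human.controllable is False."""
    def reset(self):
        return [0.0] * 25  # placeholder robot-only observation

    def step(self, action):
        obs, reward, done, info = [0.0] * 25, 0.0, True, {}
        return obs, reward, done, info

# Rollout loop: only the robot policy acts; the human is static, so no
# human observation or human action ever enters the loop.
env = StubStaticHumanEnv()
policy = StubRobotPolicy()
obs = env.reset()
done = False
total_reward = 0.0
while not done:
    action = policy.compute_action(obs)  # robot action only
    obs, reward, done, info = env.step(action)
    total_reward += reward
```

The key point of the sketch is that the loop never queries a human policy, which is why only the robot half of the coop checkpoint is needed.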
Thanks for the help. I am trying to follow the steps you suggested. In order to pull out and use only the policy for the robot, I did the following:
In `learn.py`:

```python
def setup_config(env, algo, coop=False, seed=0, extra_configs={}):
    num_processes = multiprocessing.cpu_count()
    if algo == 'ppo':
        config = ppo.DEFAULT_CONFIG.copy()
        config['train_batch_size'] = 19200
        config['num_sgd_iter'] = 50
        config['sgd_minibatch_size'] = 128
        config['lambda'] = 0.95
        config['model']['fcnet_hiddens'] = [100, 100]
    elif algo == 'sac':
        # NOTE: pip3 install tensorflow_probability
        config = sac.DEFAULT_CONFIG.copy()
        config['timesteps_per_iteration'] = 400
        config['learning_starts'] = 1000
        config['Q_model']['fcnet_hiddens'] = [100, 100]
        config['policy_model']['fcnet_hiddens'] = [100, 100]
        # config['normalize_actions'] = False
    config['num_workers'] = num_processes
    config['num_cpus_per_worker'] = 0
    config['seed'] = seed
    config['log_level'] = 'ERROR'
    # if algo == 'sac':
    #     config['num_workers'] = 1
    # HERE ARE THE CHANGES
    obs = env.reset()
    config['observation_space'] = env.observation_space_robot
    config['action_space'] = env.action_space_robot
    # if coop:
    #     obs = env.reset()
    #     policies = {'robot': (None, env.observation_space_robot, env.action_space_robot, {}),
    #                 'human': (None, env.observation_space_human, env.action_space_human, {})}
    #     config['multiagent'] = {'policies': policies, 'policy_mapping_fn': lambda a: a}
    #     config['env_config'] = {'num_agents': 2}
    return {**config, **extra_configs}
```
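At the config level, "pulling out the robot policy" amounts to keeping only the `'robot'` entry of the coop multiagent policies dict. The sketch below is a hedged illustration of that selection using placeholder tuples in the same `(cls, obs_space, act_space, config)` shape the coop code registers; the space objects are stand-ins, not real Gym spaces or RLlib calls.

```python
# Stand-ins for env.observation_space_robot / env.action_space_robot etc.
obs_robot = ('robot_obs_space',)
act_robot = ('robot_act_space',)
obs_human = ('human_obs_space',)
act_human = ('human_act_space',)

# Coop training registers one policy spec per agent:
policies = {'robot': (None, obs_robot, act_robot, {}),
            'human': (None, obs_human, act_human, {})}

# For a static-human env, only the robot spec's spaces go into the
# top-level config; the 'human' entry and the multiagent block are dropped.
config = {}
_, config['observation_space'], config['action_space'], _ = policies['robot']
```

This mirrors the commented-out `if coop:` block above: the same spaces that would have been the robot's half of the multiagent setup become the single-agent spaces.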
Is that how you pull out the policy for the robot only?
Hi,
would it be possible to run policies trained on active-human environments in static-human environments?
In other words, suppose I trained a policy for the environment "FeedingJacoHuman-v1" and now want to render that policy in the environment "FeedingJaco-v1".
How can I achieve this?
I tried changing the folder name of the trained policy from FeedingJacoHuman-v1 to FeedingJaco-v1 and ran the following command:
However, I get the following error:
Thanks a lot.