JaninaMattes opened this issue 1 year ago
Hi @JaninaMattes
Thank you for reporting! I am not sure why this would be the case:

> but in a repeated run the values were no longer normalised

Can you show me what you were running to raise this problem / how you came to this conclusion? I'd be available to organize a Zoom call if creating a bug report with an example is too laborious.
Hello everyone,
First of all, thank you for the opportunity to use this project. I have written my own PPO algorithm and would like to test it on `takeoff-aviary-v0`. I have modified the `learn.py` script under the `./examples` folder. In a first test run the values seemed to be normalised by default, due to the `_clipAndNormalizeState(self, state)` call in the Aviary class, and showed a promising learning result. In a repeated run, however, the values were no longer normalised, and I could not yet figure out how to properly normalise the observation/action space and the rewards. Could the lack of normalisation be a result of an incorrect registration of the custom gym environment?
I followed the instructions and ran `pip3 install -e .` to register the environments. The pybullet drone environment is then created via:

```python
env = gym.make(env_id)
env.seed(seed)
env.action_space.seed(seed)
env.observation_space.seed(seed)
```
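To rule out a registration problem on my side, I also added a quick sanity check before creating the environment. This is only my own check (it assumes the id `takeoff-aviary-v0`, and the exact registry API differs between gym versions):

```python
import gym

# Sanity check: gym.spec() raises an error if the id is not registered.
print(gym.spec("takeoff-aviary-v0"))

# Second check: list all registered ids. Older gym exposes registry.all(),
# newer versions keep the registry as a plain dict.
try:
    registered_ids = [spec.id for spec in gym.envs.registry.all()]
except AttributeError:
    registered_ids = list(gym.envs.registry.keys())
print("takeoff-aviary-v0" in registered_ids)
```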
I pass the environment as an argument to my PPOTrainer:

```python
trainer = ppo.PPOTrainer(env, total_training_steps=1_000_000)
# train PPO
```
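In case it clarifies what I mean by normalising externally, this is the rough workaround I was considering before handing the environment to my trainer. It is only a sketch: it assumes a gym version that ships the `NormalizeObservation` / `NormalizeReward` wrappers, and I am not sure whether this is the intended approach or whether `_clipAndNormalizeState` should already cover it:

```python
import gym
from gym.wrappers import NormalizeObservation, NormalizeReward

# Sketch of an external-normalisation workaround (not necessarily the intended path):
# keep running statistics over observations and rescale rewards before training.
env = gym.make("takeoff-aviary-v0")
env = NormalizeObservation(env)  # running mean/std over observations
env = NormalizeReward(env)       # running scaling of the rewards

# ppo is my own module, as above
trainer = ppo.PPOTrainer(env, total_training_steps=1_000_000)
```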
I would be very grateful for any advice!