Closed SKYLEO98 closed 1 month ago
Hello, I could not find out how to change the default episode length of 1000 when using the "enjoy" command to test a trained agent.
```
python3 enjoy.py --algo tqc --env gym_hexapod_zoo-v0 -f logs/ --exp-id 5 --load-best -n 5000000
```
Hyperparameters:
```yaml
gym_hexapod_zoo-v0:
  n_timesteps: !!float 2e6
  policy: 'MlpPolicy'
  learning_rate: !!float 3e-4
  buffer_size: 100000
  batch_size: 256
  ent_coef: 'auto'
  train_freq: 1
  gradient_steps: 1
  learning_starts: 10000
```
The algorithm is TQC.
The case study is to train a hexapod robot to follow a path, for instance an eight-shaped path. The path is generated as follows:
```python
import numpy as np

# Excerpt from the environment class.
# In __init__: 1000 sample points along the path
self.time = np.linspace(0, 2 * np.pi, 1000)

def _generate_eight_shape_path(self, time):
    scale_x = 4  # Scale factor for x-coordinate
    scale_y = 6  # Scale factor for y-coordinate
    # Define x and y coordinates for the 8-shaped path with scaled dimensions
    x = scale_x * np.sin(time)
    y = scale_y * np.sin(time) * np.cos(time)
    return x, y
```
Basically, I ignore the time cost, as long as the robot reaches the consecutive goals defined by the 1000 points in self.time. After training, the agent performs a path-tracking task, controlling the hexapod so that it follows that path. The length of one episode is determined by the number of points in self.time and the maximum number of steps allowed to reach a sub-goal in world Cartesian coordinates (10000 steps for each episode). A time flag counts up to shift to the next goal until it reaches 1000, and is then reset. However, the enjoy command always resets the environment once the episode length reaches 1000. I cannot explain this behavior, since the actual run does not even complete one episode.
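To make the goal-shifting logic described above concrete, here is a minimal sketch of a time flag that steps through the sub-goals. `GoalTracker`, `tolerance`, and `update` are hypothetical names used for illustration only; they are not taken from the actual environment.

```python
import numpy as np


def generate_eight_shape_path(time, scale_x=4, scale_y=6):
    """Eight-shaped path, same formula as in the post above."""
    x = scale_x * np.sin(time)
    y = scale_y * np.sin(time) * np.cos(time)
    return x, y


class GoalTracker:
    """Hypothetical helper: walks a time flag through the sub-goals."""

    def __init__(self, n_goals=1000, tolerance=0.05):
        self.time = np.linspace(0, 2 * np.pi, n_goals)
        self.path_x, self.path_y = generate_eight_shape_path(self.time)
        self.tolerance = tolerance  # assumed distance at which a goal counts as reached
        self.time_flag = 0

    @property
    def goal(self):
        return np.array([self.path_x[self.time_flag], self.path_y[self.time_flag]])

    def update(self, position):
        """Shift to the next goal if reached; return True once the path is complete."""
        if np.linalg.norm(self.goal - position) <= self.tolerance:
            self.time_flag += 1
            if self.time_flag >= len(self.time):
                self.time_flag = 0  # reset for the next lap
                return True
        return False


# Idealized robot that lands exactly on every goal in a single step.
tracker = GoalTracker()
steps = 0
done = False
while not done:
    done = tracker.update(tracker.goal)
    steps += 1
```

Even this idealized robot needs one step per sub-goal, so a real robot that takes several steps per sub-goal cannot finish a lap within a 1000-step episode.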
How can I modify this parameter, or track down where it counts to 1000 episodes?
Many thanks in advance.
You mean 1000 steps? You probably set a maximum number of episode steps (max_episode_steps) when registering your env.
That's exactly the issue. I had almost forgotten about this configuration. Thanks a lot for the correction.
```python
from gymnasium.envs.registration import register

register(
    id="gym_hexapod_zoo-v0",
    entry_point="gym_hexapod_zoo.envs:gym_hexapod_zoo",
    max_episode_steps=1000,
)
```
❓ Question
I trained my custom env with the TQC algorithm. However, the trained agent keeps resetting every 1000 episodes, even though I did not set a number of episodes for a reset. Is this related to a hyperparameter or a hidden feature that I could tune?
Checklist