Hello, I am using the PPO implementation in your repository to train the space robot, but I have run into a problem. I train with PPO/Discrete/PPO/main.py, and the XML file is spacerobotstate. The training log looks like the train_log.txt in your repository, but when I use the trained policy to control the space robot, it just stays still and doesn't move, as shown in the attached photo.
My eva.py is:
import gym
import torch as T
import numpy as np
from agent import Agent
import SpaceRobotEnv

if __name__ == '__main__':
    env = gym.make("SatelliteEnv-v0")
    n_eval_episodes = 20
    action_space = env.action_space.shape[0]
    obs_shape = env.observation_space['observation'].shape
    agent = Agent(n_actions=action_space,
                  batch_size=16,
                  alpha=0.0003,
                  n_epoch=3,
                  input_dims=obs_shape,
                  model_name_actor="space_robot_actor.pt",
                  model_name_critic="space_robot_critic.pt")
    agent.load_model()
    score_history = []
    for episode in range(n_eval_episodes):
        obs = env.reset()
        observation = obs["observation"]
        done = False
        score = 0
        while not done:
            env.render()
            action, _, _ = agent.choose_action(observation)
            a = action.reshape(14,)
            a = a.clip(env.action_space.low, env.action_space.high)
            observation_, reward, done, info = env.step(a)
            score += reward
            observation = observation_["observation"]
        score_history.append(score)
        print(f"Episode {episode + 1} Score: {score:.2f}")
    avg_score = np.mean(score_history)
    print(f"\nAverage Score over {n_eval_episodes} episodes: {avg_score:.2f}")
    env.close()
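One way to narrow this down might be to check whether the policy is producing near-zero actions, or actions that all end up clipped to the bounds. The sketch below is only an illustration: `summarize_actions` is a hypothetical helper, not part of the repository, and it assumes the actions from one episode are collected into an array of shape (steps, 14) before being passed in.

```python
import numpy as np


def summarize_actions(actions, low, high, tol=1e-6):
    """Summarize the actions of one episode (hypothetical diagnostic helper).

    actions: array-like of shape (steps, n_joints), the raw policy outputs.
    low, high: action bounds, e.g. env.action_space.low / .high.
    """
    actions = np.asarray(actions, dtype=float)
    low = np.asarray(low, dtype=float)
    high = np.asarray(high, dtype=float)

    # Apply the same clipping that eva.py applies before env.step().
    clipped = np.clip(actions, low, high)

    # An entry counts as saturated if it sits on either bound after clipping.
    saturated = (clipped <= low + tol) | (clipped >= high - tol)

    return {
        # Close to 0 means the policy is commanding almost no motion.
        "mean_abs_action": float(np.mean(np.abs(clipped))),
        # Per-joint spread over the episode; all zeros means constant output.
        "per_joint_std": clipped.std(axis=0),
        # Close to 1 means nearly every command is pinned at a bound.
        "saturated_fraction": float(np.mean(saturated)),
    }
```

In eva.py this could be used by appending each `a` to a list inside the `while` loop and calling `summarize_actions(actions, env.action_space.low, env.action_space.high)` at the end of the episode. A very small `mean_abs_action` would suggest the actor itself outputs near-zero commands, while a `saturated_fraction` near 1 would suggest the clip is hiding much larger outputs.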
Could you help me solve this problem? Thank you very much!