Closed: akeaveny closed this issue 3 years ago.
Hi @akeaveny! Thank you for your kind words and for trying out robo-gym! The gif you sent already helps a lot, but I need some more information to help you further. I tried to reproduce the problem on my machine, but I haven't been able to so far.
I assume you followed the Standard Installation, is this correct?
Are you starting the server manager using $ start-server-manager, or are you manually starting a single simulated robot server with $ ur10_sim_robot_server.launch?
Could you please paste here every line of code you run? I would need the command with which you start the robot server and the Python code snippet, part of which I can see in your gif.
Are you using Ubuntu 18.04 and ROS Melodic?
Thank you and I hope we can get to the bottom of this soon!
Thanks Matteo!
I've answered your questions below.
Yes, I cloned robo-gym in my catkin_ws and robo-gym-robot-servers in a separate robogym_ws. I also used conda for my Python 2.7 & 3.5 envs.
This is my launch sequence:
1. roslaunch urwt_rl_robot_server uwrt_rl_sim_robot_server.launch in my robogym_ws
2. start-server-manager with the Python 3.5 env
3. python td3_script.py with the Python 3.5 env
Yes! 18.04 with ROS Melodic.
Here's my script.
import gym
import robo_gym
from robo_gym.wrappers.exception_handling import ExceptionHandling
from stable_baselines3 import TD3
from stable_baselines3.td3 import MlpPolicy
# specify the ip of the machine running the robot-server
target_machine_ip = 'localhost'
robot = 'EndEffectorPositioningUR10Sim-v0'
episodes = 15000
save_model_path = 'td3_mir_basic' + str(episodes)
# initialize environment (to render the environment set gui=True)
env = gym.make(robot, ip=target_machine_ip, gui=True)
# apply the exception handling wrapper for better simulation stability
env = ExceptionHandling(env)
obs = env.reset()
# follow the instructions provided by stable-baselines
model = TD3(MlpPolicy, env, verbose=1)
model.learn(total_timesteps=episodes)
# saving and loading a model
model.save(save_model_path)
print("Done Training..")
del model
model = TD3.load(save_model_path)
print("Loaded model {}..".format(save_model_path))
# TODO: reinitialize simulation environment with rendering enabled
# env.kill_sim()
# env = gym.make(robot, ip=target_machine_ip, gui=True)
# env = ExceptionHandling(env)
# run the environment 10 times using the trained model
num_episodes = 10
for episode in range(num_episodes):
    obs = env.reset()
    done = False
    while not done:
        action, _states = model.predict(obs)
        obs, reward, done, info = env.step(action)
    print("\tEpisode:{}, Reward:{:.2f}".format(episode, reward))
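As a side note on the commented-out TODO in the script above: a minimal way to realize it, sketched here under the assumption that env.kill_sim() (already referenced in the commented lines) shuts down the currently running simulation, could look like this:

# Sketch only: re-create the simulation with rendering enabled before evaluation.
# Assumes env.kill_sim() tears down the current simulated robot server,
# as the commented-out lines in the script suggest.
env.kill_sim()
env = gym.make(robot, ip=target_machine_ip, gui=True)
env = ExceptionHandling(env)

In this particular script the environment is already created with gui=True, so the re-initialization would only matter if training had been run with gui=False.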
Thank you for your detailed description. I will update the documentation, since this is probably not explained well enough.
When starting the framework you should have 2 terminal windows open:
$ start-server-manager
$ python td3_script.py
Can you try this out and let me know if this works?
Great, this works!
Our plan is to extend this to our robot model.
We've already developed a ROS controller which interfaces with Gazebo or our real robot.
I'm currently working on a ROS bridge modeled on the UR10 one and will reach out if I face any issues.
That sounds great! Don't hesitate to reach out if you need any help!
Once you are done it would be great if you'd share your extension with the community by submitting a pull request here :)
Good luck with your work! :)
Thanks for sharing this repo, it's great work! I also came across pyrobolearn but this is much easier to digest.
I have installed everything correctly but noticed that RViz doesn't update correctly. Here's the output of my python random_agent_sim.py run.
I used the action below and observed that the reward and /gazebo/link_states were changing.
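(For context, a random-agent run of this kind typically has roughly the following shape; this is only a sketch, not the actual contents of random_agent_sim.py, with the environment name and wrapper assumed from the training script above.)

import gym
import robo_gym
from robo_gym.wrappers.exception_handling import ExceptionHandling

# Minimal random-agent sketch (assumed setup, not the real random_agent_sim.py).
env = gym.make('EndEffectorPositioningUR10Sim-v0', ip='localhost', gui=True)
env = ExceptionHandling(env)

obs = env.reset()
done = False
while not done:
    # Sample a random action from the action space and step the simulation.
    action = env.action_space.sample()
    obs, reward, done, info = env.step(action)
    print("Reward: {:.2f}".format(reward))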