jr-robotics / robo-gym

An open source toolkit for Distributed Deep Reinforcement Learning on real and simulated robots.
https://sites.google.com/view/robo-gym
MIT License

Passing trajectory to mir100 #15

Closed · disembarrasing closed 3 years ago

disembarrasing commented 3 years ago

Hi, I'd like to use your framework for testing RL agents as local collision-avoidance algorithms. The main idea is to move an agent between subgoals received from an A* planner. To my understanding, I have to create my own environment and ROS bridge. Do you have any suggestions on how to get this going?

matteolucchi commented 3 years ago

Hello and thank you for your interest in robo-gym!

If I got it right you want to do this with the MiR 100, is that correct?

If so, the ROS bridge we provide exposes Twist and laser scanner data and allows you to control the robot via the /cmd_vel topic. If you need additional sensors or, in general, a different interface, you need to extend the ROS bridge.
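For reference, a minimal plain-ROS sketch of driving the robot over /cmd_vel looks roughly like this (generic rospy code, not part of the robo-gym API itself; node name and speed are arbitrary):

```python
import rospy
from geometry_msgs.msg import Twist

# Publish a constant forward velocity on /cmd_vel.
# This is plain ROS, independent of the robo-gym interface.
rospy.init_node('mir100_cmd_vel_demo')
pub = rospy.Publisher('/cmd_vel', Twist, queue_size=1)
rate = rospy.Rate(10)  # 10 Hz command rate

msg = Twist()
msg.linear.x = 0.2  # m/s forward, arbitrary example value

while not rospy.is_shutdown():
    pub.publish(msg)
    rate.sleep()
```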

Furthermore, as you mentioned you also need to create your own environment.

If you give me some more details on which robot and which sensors you want to use, I can try to give you more specific pointers.

Cheers,

Matteo

disembarrasing commented 3 years ago

Hi, thank you for the reply. I'd prefer to use the MiR100, since it's already implemented. First, I'd like to train agents in a simulated environment, then move on to tests on a real robot. What's bothering me, though, is the rs_state variable, because it's a numpy array (if I recall correctly): to pass a trajectory, I'd have to replace the rs_state[0:3] values after reaching the current target location, but I'm not really a fan of this solution. Do you have any idea if it can be solved differently?
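For concreteness, the workaround I have in mind would look roughly like this (the helper name and tolerance are made up for illustration, and I'm assuming rs_state[0:3] holds the target pose as x, y, yaw):

```python
import numpy as np

def advance_target(rs_state, subgoals, idx, robot_xy, tol=0.1):
    """Illustrative only: overwrite the target slots of rs_state with
    the next A* subgoal once the robot is within tol of the current one.
    Assumes rs_state[0:3] = (target_x, target_y, target_yaw)."""
    if idx < len(subgoals) - 1 and np.linalg.norm(robot_xy - subgoals[idx][:2]) < tol:
        idx += 1
        rs_state[0:3] = subgoals[idx]
    return rs_state, idx
```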

matteolucchi commented 3 years ago

Hi, yes, I understand what you mean. The environment as it is built right now supports a single target point. I see two options here:

1 - Include the full trajectory in the state representation of the environment and train an agent to follow the full trajectory. This would require designing a different reward function, and I am not so sure it would work out well.

2 - Build on top of the existing environment: implement a module that handles passing the trajectory points to it one at a time (see the sketch below).
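Something like the following gym wrapper could be a starting point for option 2. Note that this is only a sketch: `set_target` is a hypothetical hook, since the existing environment would need to expose a way to update its target pose.

```python
import gym

class SubgoalWrapper(gym.Wrapper):
    """Sketch of option 2: feed A* waypoints to the underlying
    single-target environment one at a time."""

    def __init__(self, env, waypoints, tol=0.1):
        super().__init__(env)
        self.waypoints = waypoints  # list of (x, y, yaw) subgoals
        self.tol = tol
        self.idx = 0

    def reset(self, **kwargs):
        self.idx = 0
        return self.env.reset(**kwargs)

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        # When the current subgoal episode finishes, move on to the
        # next waypoint instead of terminating. `set_target` is a
        # hypothetical method the wrapped env would need to provide.
        if done and self.idx < len(self.waypoints) - 1:
            self.idx += 1
            self.env.set_target(self.waypoints[self.idx])
            done = False
        return obs, reward, done, info
```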

I am sorry, but I don't know how to help you more than this. If you find a way forward and have more specific questions about the framework, please open another issue and I'll be happy to answer :)