denisgriaznov / ReinforcementLearningSpyderWithUnityMLAgents

This project shows how to create a robotic spider in Unity and train it to walk using ML-Agents, the Gym framework, and ready-made algorithms from StableBaselines3.

Robotic Motion Control #1

Closed: learning-github-zgy closed this issue 2 months ago

learning-github-zgy commented 2 months ago

Hello, have you tried using imitation learning in ML-Agents for robot motion control? If so, how did you record the demo data for the robot? After all, it's hard to drive a robot with a keyboard!

denisgriaznov commented 2 months ago

Hello! No, I haven't tried imitation learning in Unity, but I don't think it should be much of a problem.

Firstly, you could simply create (or download a ready-made) animation for the robot, temporarily turn off physics, and generate many trajectories by slightly varying the animation curves: https://docs.unity3d.com/Manual/AnimationClips.html

Secondly, you could build a simple controller without turning off physics (ArticulationBody drives already provide position and velocity control), perhaps adding some heuristics. This method works better if you save the motor control history as an observation.
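A minimal sketch of that second idea, outside Unity (in the engine itself the gains would live on the ArticulationBody drive rather than in user code; the single unit-inertia joint and all gain values here are illustrative assumptions): a PD position controller tracks a target angle, and the control history is recorded so it could later be fed back as an observation.

```python
# Toy PD position controller for a single 1-DOF joint (unit inertia assumed).
# Unity's ArticulationBody drives do this internally; this sketch only
# illustrates tracking a setpoint while recording the control history.

def pd_step(position, velocity, target, kp=20.0, kd=2.0, dt=0.02):
    """One control tick: PD torque, then semi-implicit Euler integration."""
    torque = kp * (target - position) - kd * velocity
    velocity += torque * dt
    position += velocity * dt
    return position, velocity, torque

def run_to_target(target, steps=500):
    """Drive the joint toward `target`, logging (position, torque) pairs."""
    position, velocity = 0.0, 0.0
    history = []  # could be saved as part of the agent's observation
    for _ in range(steps):
        position, velocity, torque = pd_step(position, velocity, target)
        history.append((position, torque))
    return position, history

final, history = run_to_target(1.0)   # converges close to the target angle
```

The same loop, run with varied targets, would generate the kind of trajectory data the comment above describes.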

learning-github-zgy commented 2 months ago

Well, thanks for your detailed answer. I will try the second method you mentioned and design a simple controller for the robot. Furthermore, I would like to design a controller for a common 19-joint robot; do you have any suggestions? What I am currently trying to do is use imitation learning in ML-Agents to teach the robot to perform some simple actions like climbing or walking. I am still very new to this area and may not be clear about some of the solutions.

denisgriaznov commented 2 months ago

A 19-DOF walking robot is a rather complex system for a conventional controller to predict each next step, but fortunately the tasks you describe are periodic movements. Therefore, I would try setting the controllers' setpoints as periodic functions of time (sines or cosines). You may be able to quickly pick, by eye, amplitudes and frequencies that correspond to stable motion, and then generate many trajectories by adding random noise. In addition, you could try to simplify the problem by introducing a hierarchy of several controllers, but that requires serious knowledge of control theory; this approach is described here: https://www.mdpi.com/2076-3417/12/21/11183
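The periodic-setpoint idea might be sketched like this (the joint count, base amplitudes, and noise ranges are made-up illustration values): each joint tracks a sine with its own amplitude, frequency, and phase, and slightly perturbing those parameters yields a family of similar but distinct trajectories.

```python
import math
import random

def make_gait(n_joints=19, seed=None):
    """Per-joint (amplitude, frequency, phase) sampled around a base gait."""
    rng = random.Random(seed)
    base = [(0.5, 1.0, 2 * math.pi * j / n_joints) for j in range(n_joints)]
    # Perturb each parameter slightly to produce a new candidate trajectory.
    return [(a * rng.uniform(0.9, 1.1),
             f * rng.uniform(0.9, 1.1),
             p + rng.uniform(-0.1, 0.1)) for a, f, p in base]

def setpoints(gait, t):
    """Target joint angles at time t, to be fed to the joint controllers."""
    return [a * math.sin(2 * math.pi * f * t + p) for a, f, p in gait]

gait = make_gait(seed=0)          # one randomized variant of the base gait
targets = setpoints(gait, t=0.5)  # setpoints for all 19 joints at t = 0.5 s
```

Sampling many seeds, and keeping only the gaits that stay stable in simulation, would produce the bank of trajectories mentioned above.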

learning-github-zgy commented 2 months ago

Thank you very much for your detailed advice! Do you mean that I should design an MPC or some other controller to make the robot perform certain actions like walking or climbing, record the demonstration files needed for imitation learning, and then train the robot with imitation learning afterwards? This sounds like a solution, but is it possible to implement in Unity? Is there a Unity example of a controller for a 19-degree-of-freedom robot? What I eventually want to achieve is complex tasks through imitation learning, like the latest Tesla robot that can sort batteries, but this definitely needs to be done step by step, starting with a simple implementation.

denisgriaznov commented 2 months ago

Generally speaking, I proposed options that would help implement imitation learning. I don't know your goals; perhaps you need this particular method for a school project or as practice. Imitation learning is well suited to tasks where we can obtain examples of expert behavior relatively "cheaply". For example, for large humanoid robots it is possible to take readings from sensors on living people, since the movements of human body parts can be easily associated with the robot's joints. The same can be done in computer games or when driving cars/drones. You could try imitation learning in simpler environments/games in Unity where you can act as the expert while controlling the agent.

If obtaining expert trajectories is harder than the learning task itself (for example, you cannot map the movements of a person or another animal onto your robot's sensors, since you yourself cannot effectively control such a robot in real time), it may be worth taking a different approach: something like PPO or SAC as a baseline that learns the behavior from scratch. It might also be worth looking at model-based methods.
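To make the "cheap expert data" point concrete, here is a deliberately tiny imitation sketch (this is not the ML-Agents API; the states, actions, and helper names are all illustrative): given recorded (state, action) pairs from an expert, the simplest possible "learned" policy just replays the action of the nearest recorded state.

```python
def nearest_neighbor_policy(demos):
    """demos: list of (state, action) pairs recorded from an expert.
    Returns a policy that copies the action of the closest seen state."""
    def policy(state):
        def dist(pair):
            s, _ = pair
            return sum((a - b) ** 2 for a, b in zip(s, state))
        _, action = min(demos, key=dist)
        return action
    return policy

# Hypothetical expert data: state is (position, velocity), action a torque.
demos = [((0.0, 0.0), 1.0), ((1.0, 0.0), 0.0), ((0.5, 0.2), 0.5)]
policy = nearest_neighbor_policy(demos)
action = policy((0.9, 0.1))  # nearest demo state is (1.0, 0.0) -> action 0.0
```

Real imitation learning (behavioral cloning, GAIL) replaces the lookup with a trained network, but the data requirement is the same: a set of expert (state, action) pairs.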

learning-github-zgy commented 2 months ago

Yes, I couldn't agree more. Imitation learning can use data collected with motion-capture suits to let robots learn human behavior, which can speed up teaching robots complex tasks. I've tried simple imitation-learning examples in ML-Agents before, and from there I started looking at how to train more complex robots with imitation learning. I've read some papers on training robots with reinforcement learning such as PPO or SAC, but the training takes a long time, especially for complex tasks where designing the reward function may be complicated, so I'd like to try imitation learning for pre-training, which may speed up the subsequent training.

Anyway, thanks a lot for your help. The robot in your repo was created with the ArticulationBody component, which is exactly what I'm trying to understand. One of my current small goals is to create a 19-degree-of-freedom robot in Unity using ArticulationBody and then train it with the PPO algorithm in ML-Agents. After that I will study how ArticulationBody is used to control robot motion in your repo and try to see how to build more complex robots with this component.
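For the ML-Agents side of that plan, the trainer is configured in YAML; a hedged sketch of what a PPO config combined with imitation signals might look like (the behavior name, demo path, and all hyperparameter values are placeholders, and the exact schema depends on your ML-Agents release, so check its docs):

```yaml
behaviors:
  SpiderRobot:                    # must match the Behavior Name in Unity
    trainer_type: ppo
    hyperparameters:
      batch_size: 2048
      buffer_size: 20480
      learning_rate: 3.0e-4
    network_settings:
      hidden_units: 256
      num_layers: 2
    reward_signals:
      extrinsic:
        gamma: 0.99
        strength: 1.0
      gail:                       # adversarial imitation from demo file
        strength: 0.5
        demo_path: Demos/expert.demo
    behavioral_cloning:           # supervised pre-training from demo file
      demo_path: Demos/expert.demo
      strength: 0.5
    max_steps: 5.0e6
```

The `behavioral_cloning` and `gail` sections are what consume the recorded demonstrations, so the controller-generated trajectories discussed earlier would be saved as `.demo` files and referenced there.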