PaulDanielML / MuJoCo_RL_UR5

A MuJoCo/Gym environment for robot control using Reinforcement Learning. The task of agents in this environment is pixel-wise prediction of grasp success chances.
MIT License
413 stars 54 forks source link

PID controllers #2

Closed hai-h-nguyen closed 3 years ago

hai-h-nguyen commented 4 years ago

Why your joint controllers have to run repetitively until reaching their setpoints? Usually, these controllers constantly run a much higher frequency in the inner loop while their setpoints are set from an outer loop which often runs at a slower rate. Therefore, the controllers do not have to reach their setpoints before the setpoints are changed. But it seems you are not following this structure.

PaulDanielML commented 4 years ago

Hi,

can you elaborate on what you mean by "the controllers do not have to reach their setpoints before the setpoints are changed"? What I am currently doing is setting the setpoints to the desired values once when calling move_group_to_joint_target. I then let the joints actuate until their angles are within a certain tolerance of the targets. I am aware that the class is far from optimal in terms of control, but it works fine for me in terms of getting the robot where I want it to go. When transferring this setup to a real robotic arm, a different controller would obviously have to be used :)

hai-h-nguyen commented 4 years ago

Hi,

So in your step function, joint controllers work until they reach the setpoints or the timeout is passed and then they break. But for the real robot, I think these controllers often work continuously without a break. They are often running in a different thread I think.

PaulDanielML commented 4 years ago

Ok, I see what you mean. I have not gotten around to implementing a multi-threaded version of the environment yet. It works fine as it is now, if you just want the arm to hold position, you can use the "stay" method. It would be very nice though to have the control part run continuously in parallel, so it would be possible already start simulating the next step on the cpu while the network is updating on the gpu.