rail-berkeley / serl

SERL: A Software Suite for Sample-Efficient Robotic Reinforcement Learning
https://serl-robot.github.io/
MIT License
343 stars 37 forks

Variable control frequency #74

Closed zichunxx closed 2 months ago

zichunxx commented 2 months ago

Hi! Thanks for your great work and sharing!

In Sec. 4.5 of your paper, a variable control frequency is proposed to achieve precise movement control.

However, this idea does not seem to be implemented in this repo, because control_dt and physics_dt are both fixed.

As shown in

https://github.com/rail-berkeley/serl/blob/21ff8a018d77ac8ee8505cfda11c567702ee70b0/franka_sim/franka_sim/mujoco_gym_env.py#L34-L36

and

https://github.com/rail-berkeley/serl/blob/21ff8a018d77ac8ee8505cfda11c567702ee70b0/franka_sim/franka_sim/envs/panda_pick_gym_env.py#L205-L217

, the step function does not vary the loop count to control how many times the downstream controller runs.

Looking forward to your reply! Thanks in advance!

jianlanluo commented 2 months ago

I am not sure what you mean by variable control frequency; dt is fixed for all experiments. Section 4.3 is about reset-free RL.

zichunxx commented 2 months ago

Thanks for your reply. @jianlanluo

Sorry for the typo; I meant Section 4.5, on the impedance controller for contact-rich tasks.

In this section, the controller frequency is adjusted to realize small actions instead of changing the policy output.

However, all the related parameters are fixed, such as _n_substeps and control_dt.
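To make the observation concrete, here is a minimal sketch of a fixed-rate step loop. The class name `FixedRateEnv`, the counter, and the default time steps are hypothetical and not taken from the repo; only the names `control_dt`, `physics_dt`, and `_n_substeps` come from the discussion above. With both time steps fixed, the number of inner iterations per `step` call is a constant.

```python
class FixedRateEnv:
    """Hypothetical stand-in for an env with fixed control and physics time steps."""

    def __init__(self, control_dt=0.02, physics_dt=0.002):
        self.control_dt = control_dt
        self.physics_dt = physics_dt
        # Computed once; nothing in step() changes it afterwards.
        self._n_substeps = int(round(control_dt / physics_dt))
        self.controller_calls = 0

    def step(self, action):
        # The low-level controller and the physics run the same number of
        # times for every action, regardless of the action's magnitude.
        for _ in range(self._n_substeps):
            self.controller_calls += 1  # placeholder for controller + physics step
        return self.controller_calls


env = FixedRateEnv()
env.step([0.0, 0.0, 0.0])
env.step([1.0, 1.0, 1.0])
```

In this sketch a small action and a large action both trigger the same number of controller updates, which is the behavior described in the question.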

jianlanluo commented 2 months ago

That's not the controller frequency; it's bounding the reference from the impedance controller.

zichunxx commented 2 months ago

Sorry, I still don't get your point. What does "the reference" mean? Is the reference bounded by action_scale?

If I am wrong, could you point out exactly where this idea is implemented? Thanks!

jianlanluo commented 2 months ago

It's clamping the reference set by upstream RL, as detailed in Section 4.5. It has nothing to do with frequency.
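A minimal sketch of what clamping the reference could look like, assuming a per-axis bound on the position error that the impedance controller sees. The function names, the gain `kp`, and the bound `max_delta` are illustrative assumptions, not code from serl or serl_franka_controllers.

```python
def clamp_reference(target, current, max_delta):
    """Return a reference no further than max_delta from current on each axis."""
    clamped = []
    for t, c in zip(target, current):
        error = t - c
        # Bound the tracking error instead of changing the control rate.
        error = max(-max_delta, min(max_delta, error))
        clamped.append(c + error)
    return clamped


def spring_force(reference, current, kp):
    """Stiffness term of an impedance law: kp * (reference - current)."""
    return [kp * (r - c) for r, c in zip(reference, current)]


current = [0.0, 0.0, 0.0]
target = [0.5, 0.0, -0.3]  # reference requested by the upstream policy
reference = clamp_reference(target, current, max_delta=0.01)
force = spring_force(reference, current, kp=2000.0)
```

With the error bounded, the spring term in this sketch cannot exceed `kp * max_delta` on any axis, however far away the policy places the target. The control loop rate is unchanged.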

zichunxx commented 2 months ago

Thanks for your patience.

One last question I hope you can answer: is action_scale used to clamp the reference? Thanks!

jianlanluo commented 2 months ago

https://github.com/rail-berkeley/serl_franka_controllers

zichunxx commented 1 month ago

Hi! @jianlanluo

Is reference limiting only available when training with the real robot? Has it been implemented in the MuJoCo simulation? I want to learn this trick and port it to another robot, but I don't have a Franka robot available.

I'm not sure I understand this trick. I think the code snippet below is the corresponding implementation; is that right?

https://github.com/rail-berkeley/serl/blob/21ff8a018d77ac8ee8505cfda11c567702ee70b0/franka_sim/franka_sim/controllers/opspace.py#L23-L28

Thanks!

jianlanluo commented 1 month ago

Only with Franka; in sim you don't need this.

zichunxx commented 1 month ago

Thanks for your reply!