realquantumcookie / APRL

Efficient Real-World RL for Legged Locomotion via Adaptive Policy Regularization
MIT License

[Question] "Soft" Constraining #2

Open DefinitlyEvil opened 1 month ago

DefinitlyEvil commented 1 month ago

Hello, thanks for the great research paper. I noticed that in the paper you add a constraint term to the loss function. With this approach, will the robot still shake a lot during initial training, since the parameters haven't adapted to the soft constraint yet? This has been confusing me for a while. Thanks, Toby.

realquantumcookie commented 1 month ago

Hi Toby,

Thank you for the question. In our experiments, when we set the hyperparameter $C$ to a reasonably large value, the robot only shakes during the first 5-10 seconds of initial training, since we use a very high UTD (update-to-data) ratio.

Best, Yunhao
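
For readers following along, here is a minimal sketch of what a "soft" constraint term in a loss can look like. The exact term used in the paper is not reproduced in this thread, so the hinge-squared form below is an assumption, and the names `soft_constraint_penalty`, `regularized_loss`, `limit`, and `c` are hypothetical. Here `c` plays the role of the coefficient $C$ mentioned above.

```python
def soft_constraint_penalty(actions, limit, c):
    """Hinge-squared penalty (assumed form, not taken from the APRL code).

    Zero while every action stays inside [-limit, limit];
    grows quadratically with the amount by which an action exceeds it.
    """
    return c * sum(max(abs(a) - limit, 0.0) ** 2 for a in actions)


def regularized_loss(base_loss, actions, limit, c):
    # The penalty is added to the loss instead of hard-clipping the action,
    # so out-of-range actions are discouraged through the gradient.
    return base_loss + soft_constraint_penalty(actions, limit, c)


# Inside the range there is no penalty; outside it scales with c.
print(soft_constraint_penalty([0.5, -0.2], limit=1.0, c=10.0))
print(soft_constraint_penalty([1.5], limit=1.0, c=10.0))
```

With a penalty of this kind, a larger `c` makes violations costly sooner, which fits the observation above that a reasonably large $C$ keeps the initial shaking short.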

DefinitlyEvil commented 1 month ago

Ah, gotcha! Thanks, that cleared up my confusion so much! Keep up the good work!

DefinitlyEvil commented 1 month ago

Hi again, sorry to bother you again. As stated by Levine, I tried to control at 10 Hz, but training now takes 0.3 s, which severely blocks the control loop (M1 Max). What technology did you use to enable async training? Thanks, Yunhao. <3

realquantumcookie commented 1 month ago

Hi! In our experiments we used an Ubuntu 18.04 LTS machine with an RTX 2070. I'm not sure the real-world control code will run on a Mac at all. Also, make sure the JAX library you run has the appropriate PJRT plugin installed for your machine (in your case, if you want to use a Mac, you need this). We also don't have any tested async training code (I guess our hardware was just good enough).
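
Since no tested async training code exists in the repo, the following is only a sketch of one common way to keep a fixed-rate control loop from waiting on gradient updates: run the updates in a background thread and let the control loop read the newest parameters. Everything here (`BackgroundLearner`, `control_loop`, `update_fn`, `act_fn`) is hypothetical and not part of APRL. It also assumes the update step releases Python's GIL while it computes, which native numerical code typically does; a pure-Python update would still contend with the control loop.

```python
import threading
import time


class BackgroundLearner:
    """Runs updates in a worker thread so the control loop never waits on them."""

    def __init__(self, update_fn, initial_params):
        self._update_fn = update_fn  # hypothetical: params -> new params
        self._params = initial_params
        self._lock = threading.Lock()
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def _run(self):
        while not self._stop.is_set():
            with self._lock:
                params = self._params
            new_params = self._update_fn(params)  # slow step, outside the lock
            with self._lock:
                self._params = new_params

    def start(self):
        self._thread.start()

    def stop(self):
        self._stop.set()
        self._thread.join()

    def latest_params(self):
        with self._lock:
            return self._params


def control_loop(learner, act_fn, hz, steps):
    """Fixed-rate loop that acts with whichever parameters are newest."""
    period = 1.0 / hz
    next_tick = time.monotonic()
    for _ in range(steps):
        act_fn(learner.latest_params())
        next_tick += period
        # Sleep only for the time left in this period, never a negative amount.
        time.sleep(max(0.0, next_tick - time.monotonic()))
```

The trade-off is that the policy acting on the robot may lag the learner by up to one update, in exchange for a control rate that no longer depends on training time.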

DefinitlyEvil commented 4 weeks ago

Ah, you used fully synchronous code? It's amazing to get 20 gradient updates at a 10 Hz rate... I built a custom hardware I/O library with an HTTP API, but I guess there is a lot of room for improvement. I have an RTX 4090 PC, though, so maybe I should switch to that. My hardware is a self-designed servo-driven robot; everything is built from scratch, so things are tricky and challenging. xD Anyway, thanks for sharing your training details. I really love your work. <3 Hope to study with you and Levine some day. Haha! Have a great day!

realquantumcookie commented 3 weeks ago

Yes, we actually ran a UTD of 20 at 20 Hz. Best of luck with your future endeavors! I will keep this issue open in case someone has similar questions.
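
To put the numbers from this thread in perspective, here is a rough back-of-the-envelope calculation of the time budget in a fully synchronous loop. It assumes all gradient updates must finish within one control period and ignores inference and hardware I/O time; the function names are made up for illustration.

```python
def per_update_budget_ms(control_hz, utd):
    """Milliseconds available per gradient update within one control period."""
    period_ms = 1000.0 / control_hz
    return period_ms / utd


def fits_in_period(control_hz, train_time_s):
    """True if a training phase of train_time_s fits inside one control period."""
    return train_time_s <= 1.0 / control_hz


# UTD 20 at 20 Hz: a 50 ms period split across 20 updates.
print(per_update_budget_ms(20, 20))   # 2.5
# UTD 20 at 10 Hz: a 100 ms period split across 20 updates.
print(per_update_budget_ms(10, 20))   # 5.0
# 0.3 s of training does not fit in a 10 Hz (100 ms) period.
print(fits_in_period(10, 0.3))        # False
```

Under these assumptions, 0.3 s of training per step is about three times the 100 ms period available at 10 Hz, which is why the loop stalls unless training is made faster or moved off the control thread.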