Improbable-AI / dribblebot

Code release accompanying DribbleBot: Dynamic Legged Manipulation in the Wild
https://gmargo11.github.io/dribblebot/

Stability issues #4

Open rubenftech opened 2 months ago

rubenftech commented 2 months ago

Hi,

Thank you for the amazing work!

While experimenting with your code, we are seeing stability issues in training, even after running it multiple times. Here is an example of one of the rew_total graphs: [image]

Is this behavior expected, or does it indicate an underlying problem? Is the maximum total reward reached here (around 350) the same as what you got? Additionally, if you could share the graphs from one of your runs, it might help us track down the issue and understand the expected behavior.

Thanks!

YandongJi commented 2 months ago

Thanks for bringing up the issue! Actually, we never tried training for 500k; we usually train for 50k at most. Up to around 50k, your curve looks very similar to mine. The reward scales should be tuned better to make the graph look more stable. I can try to tune them in the coming days. But can you also evaluate the policy? The policy should usually perform OK. FYI, this work uses the same reward scale, and it looks like they get similar results.
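
When comparing rew_total curves between runs, it can help to smooth the curve and look at its spread over a trailing window instead of judging the raw plot by eye. The sketch below is a minimal illustration in plain Python, not code from this repository: the helper names `ema` and `rolling_std` and the sample values are hypothetical, and it assumes the logged rew_total values have already been exported as a list of floats.

```python
import statistics


def ema(values, alpha=0.1):
    """Exponential moving average; a smaller alpha smooths more."""
    smoothed = []
    current = None
    for v in values:
        # Seed with the first value, then blend each new value in.
        current = v if current is None else alpha * v + (1 - alpha) * current
        smoothed.append(current)
    return smoothed


def rolling_std(values, window):
    """Population standard deviation over a trailing window.

    The window is shorter than `window` for the first few points.
    """
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        out.append(statistics.pstdev(chunk))
    return out


if __name__ == "__main__":
    # Made-up numbers for illustration only, not results from any run.
    rew_total = [100.0, 180.0, 260.0, 340.0, 120.0, 330.0, 90.0, 350.0]
    for raw, avg, std in zip(rew_total, ema(rew_total, alpha=0.3),
                             rolling_std(rew_total, window=4)):
        print(f"raw={raw:7.1f}  ema={avg:7.1f}  rolling_std={std:7.1f}")
```

A rolling standard deviation that keeps growing after the smoothed curve has flattened would point to real instability, not just logging noise. The final answer still comes from evaluating the policy, as suggested above.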