Closed UltronAI closed 10 months ago
Hi, I am trying to learn the elasticity parameters in position-based dynamics by generating two scenarios: one simulation with the real elasticity parameters and another with random elasticity. I then compute the loss with respect to positions, but the gradient of that loss with respect to elasticity comes out as zero. Did you get something similar?
Hi @amine789. While I didn't directly experiment with elasticity parameters, I found some relevant insights in this paper. See the data presented in the final row of Table 1. The authors provided a reasonable explanation in the corresponding section. Hope it's helpful for you!
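For the zero elasticity gradient reported above, one possible cause that may be worth ruling out is that no contact occurs during the rollout, so elasticity never enters the computation. This is only a guess; nothing in this thread confirms it. Below is a minimal JAX sketch on a toy 1D ball. The bounce rule, `final_height`, and `position_loss` are hypothetical stand-ins written for illustration and are not Brax's positional solver.

```python
import jax
import jax.numpy as jnp


def final_height(elasticity, num_steps, dt=0.01, z0=1.0, gravity=9.81):
    """Toy 1D ball drop; the bounce rule is illustrative, not Brax's contact solver."""

    def step(state, _):
        z, v = state
        v = v - gravity * dt          # integrate gravity
        z = z + v * dt                # integrate position
        hit = (z < 0.0) & (v < 0.0)   # ball is below the floor and moving down
        v = jnp.where(hit, -elasticity * v, v)  # elasticity only enters on contact
        z = jnp.where(hit, 0.0, z)    # project back onto the floor
        return (z, v), None

    init = (jnp.float32(z0), jnp.float32(0.0))
    (z, _), _ = jax.lax.scan(step, init, None, length=num_steps)
    return z


def position_loss(elasticity, target_elasticity, num_steps):
    # Squared difference between the final heights of the two rollouts.
    diff = final_height(elasticity, num_steps) - final_height(target_elasticity, num_steps)
    return diff ** 2


grad_fn = jax.grad(position_loss)  # gradient w.r.t. the first argument (elasticity)
grad_before_contact = grad_fn(0.3, 0.8, 20)  # rollout ends while the ball is still falling
grad_after_contact = grad_fn(0.3, 0.8, 60)   # rollout includes the first bounce
```

In this toy setting the short rollout yields a gradient of exactly zero because the ball never touches the floor, while the longer rollout yields a nonzero gradient. If the real rollouts do include contacts, the cause presumably lies elsewhere.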
Hi @UltronAI , thanks for the insightful experiments using brax!
[1] Tossing the ball
The restitution params between spring/positional and generalized are quite different. For spring/positional, they are controlled by the elasticity param (https://github.com/google/brax/blob/main/brax/io/mjcf.py#L182), and for generalized we use the contact model from MuJoCo. The restitution for generalized is controlled by the solver params; see https://mujoco.readthedocs.io/en/latest/modeling.html#restitution
Other params that may affect restitution in positional are the physics timestep and collide_scale, amongst others.
[2] Swinging the pendulum
I would recommend tuning parameters to maintain stability in positional/spring. Joint constraints are implicitly maintained in generalized since it uses Featherstone's algorithm. For positional/spring, joint constraints are resolved at every time step, so they are likely to be more unstable and need to be tuned (angular/linear damping etc.).
It isn't clear whether you are training a policy for swinging. Are you drawing stability conclusions from the final policy or from the physics alone?
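To illustrate why step size matters when a constraint is enforced by a stiff restoring force, here is a toy sketch in which a single stiff spring stands in for a joint constraint. This is an assumption made for illustration, not how Brax's positional or spring backends are actually implemented; `max_constraint_error` and its parameters are hypothetical. For semi-implicit Euler on a spring with angular frequency `omega = sqrt(stiffness / mass)`, the integration stays bounded only while `dt * omega < 2`.

```python
import jax
import jax.numpy as jnp


def max_constraint_error(dt, stiffness=1.0e4, mass=1.0, num_steps=30):
    """Largest |x| reached by a stiff spring integrated with semi-implicit Euler."""

    def step(state, _):
        x, v = state
        v = v - (stiffness / mass) * x * dt  # spring pulls the error back toward zero
        x = x + v * dt                       # position update uses the new velocity
        return (x, v), jnp.abs(x)

    init = (jnp.float32(0.01), jnp.float32(0.0))  # start with a 1 cm constraint error
    _, errors = jax.lax.scan(step, init, None, length=num_steps)
    return errors.max()


# omega = sqrt(1e4 / 1) = 100 rad/s, so the stability limit is dt < 0.02 s.
error_small_dt = max_constraint_error(0.001)  # dt * omega = 0.1, stays bounded
error_large_dt = max_constraint_error(0.03)   # dt * omega = 3.0, grows every step
```

With the smaller step the error stays near its initial 1 cm, while the larger step makes it grow without bound. Damping or a smaller physics timestep are the usual knobs in this toy picture, which is consistent with the tuning advice above.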
[4] training
Thanks for the findings! The contact impulses are backed out through the constraint solver in generalized, so it is not too surprising that the gradients may be less useful (autograd through a constraint solve) compared to the simple impulse-based contact updates in spring/positional.
For further reference on the contact model for generalized, see https://mujoco.readthedocs.io/en/latest/computation.html#contact
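Since this thread also reports NaN gradients with `generalized`, a quick sanity check on a gradient pytree can help tell "less useful" gradients apart from invalid ones. The sketch below uses only standard JAX calls; `gradient_report` and `toy_loss` are hypothetical helpers written for illustration and are not part of Brax.

```python
import jax
import jax.numpy as jnp


def gradient_report(grads):
    """Summarize a gradient pytree: any NaNs, and the global L2 norm."""
    leaves = jax.tree_util.tree_leaves(grads)
    has_nan = any(bool(jnp.isnan(g).any()) for g in leaves)
    global_norm = jnp.sqrt(sum(jnp.sum(jnp.square(g)) for g in leaves))
    return {"has_nan": has_nan, "global_norm": float(global_norm)}


def toy_loss(params):
    # sqrt has an infinite slope at 0, so the chain rule produces inf * 0 = NaN there.
    return jnp.sum(jnp.sqrt(params["w"] ** 2))


good = gradient_report(jax.grad(toy_loss)({"w": jnp.array([1.0, -2.0])}))
bad = gradient_report(jax.grad(toy_loss)({"w": jnp.array([0.0, 1.0])}))
```

The same report could be applied to the gradients of a policy loss at each training step. JAX also provides the `jax_debug_nans` configuration flag (`jax.config.update("jax_debug_nans", True)`), which raises an error at the operation that first produces a NaN.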
Hello there,
I have recently been exploring and experimenting with the Brax Basics Colab, training with the built-in APG/PPO algorithms and investigating the impact of different physics backends. In the process, I've encountered some intriguing behavior patterns that I'd like to share and discuss.
In this task, the ball is expected to bounce after hitting the ground. The default `positional` backend executed this behavior well and produced satisfying results. However, when I employed the `generalized` backend, the ball merely rolled on the ground, failing to bounce. This response does not correspond to expected physics-based behavior.

Here's the visualization with `positional` in Red, `generalized` in Green, and `spring` in Blue:

In this task, the `generalized` backend performed well, providing the expected results. However, the `positional` backend collapsed for all step sizes, and the `spring` backend exhibited instability when the step size increased.

In this task, only `generalized` worked well. Both the `positional` and `spring` backends failed to maintain stability.

During my attempts to train an APG agent using these backends, the results varied significantly.
For the Ant task, which has rich contact interactions, APG struggled with the `generalized` backend but exhibited some learning capacity with `positional` and `spring`.

As for the Reacher task, which is a simpler system, APG with `generalized` outperformed the other two backends. (Update: I find that `reacher` with `generalized` may lead to NaN gradients as well.)

The parameters I used for these two experiments:
Interestingly, these results seem to suggest that the `generalized` backend excels when handling contacts but struggles to compute their gradients (or the analytic gradients are useless for training), with the exception of the ball-tossing experiment. On the other hand, the `positional` and `spring` backends handle more complex systems efficiently but fail when dealing with simpler systems like the pendulum.

(Update: after more experiments with `positional` and `spring`, I find that they are unsurprisingly not as good as `generalized` in some environments and can lead to different learned policies. For instance, in `walker2d`, PPO with `generalized` learns to walk with two legs, PPO with `positional` learns to walk with only one leg, and PPO with `spring` fails to walk, all using the same training parameters.)

Moreover, I recently came across a paper that delves into how varying contact models can produce different outcomes. The authors of this paper conducted a series of experiments using Brax v1, specifically employing the `positional` and `legacy_spring` backends.

According to their findings, the discrepancies between the results could be attributed to the different contact models used. This makes me all the more curious about the mechanics of the `generalized` backend, particularly how it models contacts.

To better understand these behaviors and the potential reasons behind the inconsistencies above, I would greatly appreciate a more in-depth explanation or any resources that could illuminate how the `generalized` backend models contact.

I look forward to your insightful responses and thanks for your great work on Brax!