Numerical error in half cheetah simulation

Hi,

I'm a student trying to use MuJoCo with Gymnasium to do model-based RL in changing environments.

I'm looking for help with possible numerical errors during a simulation.

I'm using the original half cheetah from Gymnasium and learning a context-aware neural network model in a reinforcement learning setting. Control is done with model predictive control. I change the environment by setting the damping, mass and totalmass at the beginning of the simulation. See: https://gist.github.com/NKPmedia/2b850a8d444c5b1fbddb6b81e9951e2f

The problem appears at a later stage of learning, once the model has learned good dynamics: the simulation produces weird values. The videos look good at the beginning. The half cheetah runs (by spinning), but at step ~672 of 1000 the simulation goes wild. Out_61.csv contains the state values, and the attached plot (output) shows them.
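
To pin down where the logged values first go wrong, something like the following could be run on the state log. This is only a minimal sketch: it assumes the CSV has one row per simulation step and one numeric column per state dimension, and the names `first_unstable_step` and `load_rows` as well as the thresholds `max_abs` and `max_jump` are illustrative assumptions, not part of the original setup.

```python
import csv
import math


def first_unstable_step(rows, max_abs=1e3, max_jump=1e2):
    """Return (step, reason) for the first suspicious row, or None.

    rows: iterable of numeric sequences, one sequence per simulation step.
    max_abs, max_jump: illustrative thresholds (assumptions), tune per task.
    """
    prev = None
    for step, row in enumerate(rows):
        values = [float(v) for v in row]
        # NaN or inf is the clearest sign that the integration blew up.
        if any(not math.isfinite(v) for v in values):
            return step, "non-finite value"
        # Finite but implausibly large values usually precede NaN/inf.
        if any(abs(v) > max_abs for v in values):
            return step, "magnitude above max_abs"
        # A large change between consecutive steps marks the onset.
        if prev is not None and len(prev) == len(values):
            if any(abs(a - b) > max_jump for a, b in zip(values, prev)):
                return step, "jump above max_jump"
        prev = values
    return None


def load_rows(path, has_header=False):
    """Read a purely numeric CSV (such as Out_61.csv) into float rows."""
    with open(path, newline="") as f:
        reader = csv.reader(f)
        if has_header:
            next(reader, None)  # skip a header line if the file has one
        return [[float(v) for v in row] for row in reader if row]
```

Calling `first_unstable_step(load_rows("Out_61.csv"))` would then report the first step at which a value is non-finite, too large, or jumps too far from the previous step, which can be compared with the step ~672 seen in the plot.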

This is a video of the first 300 steps of an episode that had this issue. https://github.com/google-deepmind/mujoco/assets/3307081/907c2512-e791-438f-97aa-148a1332ad4c

How can I tell whether this is a numerical issue in MuJoCo? And if it is, how can I get around it?

