Batou1406 / dls_orbit_bat_private

Unified framework for robot learning built on NVIDIA Isaac Sim
https://isaac-orbit.github.io/orbit/

Problem with NaN #19

Open Batou1406 opened 1 month ago

Batou1406 commented 1 month ago


Bug Description

Occasionally, training crashes because of an invalid value in the returned observation. This bug is rare and typically appears only after several hours of training, which makes it hard to debug.

The observation fails the constraint: torch.distributions.constraints.real.check(observation). This is because a NaN value had propagated into the joint torque (u). Inspection showed that the swing trajectory contained NaN values, even though they should not have been used because the leg was in stance (i.e. $c_i=0$).
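To make the failing check easier to act on, a small helper can report where the non-finite entries are. This is only a sketch: the helper name find_non_finite and the example tensor are hypothetical and not taken from the repository.

```python
import torch

def find_non_finite(observation: torch.Tensor) -> torch.Tensor:
    """Return the indices of all entries that are NaN, +inf or -inf."""
    return torch.nonzero(~torch.isfinite(observation))

# Hypothetical batch of 2 observations with 3 entries each.
obs = torch.tensor([[0.1, 0.2, 0.3],
                    [0.4, float("nan"), 0.6]])

# Element-wise validity, as used by the constraint mentioned above.
valid = torch.distributions.constraints.real.check(obs)

# (row, column) indices of the offending entries.
bad_indices = find_non_finite(obs)
```

Logging bad_indices at the moment the constraint fails shows which environment and which observation entry is affected, which may help narrow down a bug that only appears after hours of training.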

The swing trajectory is computed for every leg, regardless of whether the leg is in swing or stance, but the torque computed for that trajectory should be applied only if the leg is in swing.

In stance, the trajectory is not valid and the math does not make sense. For example, the swing phase is a phase and should thus be $\in [0,1]$, but when the leg is in stance the swing phase does not exist, yet it is still computed and may return an invalid value. In stance, there may be divisions by zero, infinite values, and negative values for variables that should be positive.

The problem is that these faulty values, which should not be used, still propagate. The reason is that a NaN multiplied by 0 returns NaN, which allows the NaN to propagate to the observation space through the following line and the faulty swing torque:
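The line referred to above is not reproduced in this thread. As a stand-in, here is a minimal sketch of the mechanism only. The names swing_mask and swing_torque are hypothetical, with swing_mask equal to 0 where the swing torque should be ignored.

```python
import torch

# Hypothetical mask: 0 where the swing torque must not be applied, 1 where it is.
swing_mask = torch.tensor([0.0, 0.0, 1.0])

# Hypothetical swing torque: the first two legs hold invalid values.
swing_torque = torch.tensor([float("nan"), float("inf"), 2.0])

# Masking by multiplication does not discard invalid values:
# under IEEE 754, 0 * nan = nan and 0 * inf = nan.
applied = swing_mask * swing_torque
```

Here applied holds NaN for the first two legs even though their mask is 0, so any sum or observation built from it is contaminated.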

Solution

No good solution has been found yet. For now, filtering the NaN values and replacing them with zero prevents the program from crashing, but further investigation is required. We are still unsure whether this problem happens only when the torques are not applied (i.e. the leg is in stance).
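The workaround described above, and one possible alternative, could look roughly like this. Both functions are hypothetical sketches and not the code used in the repository. The second one selects values with torch.where instead of multiplying by the mask, so invalid values in unused entries never reach the result.

```python
import torch

def replace_nan_with_zero(x: torch.Tensor) -> torch.Tensor:
    """Workaround: replace NaN entries by zero, leave everything else untouched."""
    return torch.where(torch.isnan(x), torch.zeros_like(x), x)

def select_swing_torque(swing_mask: torch.Tensor, swing_torque: torch.Tensor) -> torch.Tensor:
    """Alternative: pick the swing torque where the mask is set, zero elsewhere.

    Unlike swing_mask * swing_torque, a NaN or inf in an unselected entry
    is dropped instead of being turned into NaN.
    """
    return torch.where(swing_mask.bool(), swing_torque, torch.zeros_like(swing_torque))

# Hypothetical example values.
mask = torch.tensor([0.0, 1.0])
torque = torch.tensor([float("nan"), 2.0])

filtered = replace_nan_with_zero(mask * torque)
selected = select_swing_torque(mask, torque)
```

Both variants only protect the forward value. If gradients are taken through the unselected branch, NaN can still appear in the backward pass, so fixing the source of the invalid value remains preferable.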

Batou1406 commented 1 month ago

So, after catching the problem with a breakpoint, I can confirm that the NaN comes from an inf multiplied by 0. However, the inf does not come from faulty math, but from a faulty input. In this case, the foot touch-down position height is equal to inf, which means the problem actually comes from the height sensor. I will investigate this problem later.
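Until the sensor issue is understood, one option is to validate the height before it enters the swing trajectory computation. The sketch below is an assumption, not the project's code: sanitize_height and the zero fallback (flat ground) are hypothetical, and a better fallback might be the last valid reading.

```python
import torch

def sanitize_height(height: torch.Tensor, fallback: torch.Tensor) -> torch.Tensor:
    """Replace non-finite sensor readings (NaN, +inf, -inf) by a fallback value."""
    return torch.where(torch.isfinite(height), height, fallback)

# Hypothetical touch-down heights for three feet; the second reading is invalid.
height = torch.tensor([0.02, float("inf"), -0.01])

# Assumption: fall back to flat ground at zero height.
fallback = torch.zeros_like(height)

clean_height = sanitize_height(height, fallback)
```

Counting how often the fallback is used, for example with (~torch.isfinite(height)).sum(), would also show how frequently the sensor returns an invalid value.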