google / brax

Massively parallel rigidbody physics simulation on accelerator hardware.
Apache License 2.0
2.25k stars 249 forks source link

Depending on the order of the geoms in the environment, brax can crash. #352

Closed mishmish66 closed 1 year ago

mishmish66 commented 1 year ago

Hello! I am not sure if I am understanding this right, but I have created a mjcf file for the franka emika panda robot, and while debugging various issues during the process, I came across this code snippet from contact.py:

def _geom_pairs(sys: System) -> Iterator[Tuple[Geometry, Geometry]]:
    for i in range(len(sys.geoms)):
        for j in range(i, len(sys.geoms)):
            mask_i, mask_j = sys.geom_masks[i], sys.geom_masks[j]
            if (mask_i & mask_j >> 32) | (mask_i >> 32 & mask_j) == 0:
                continue
            geom_i, geom_j = sys.geoms[i], sys.geoms[j]
            if i == j and geom_i.link_idx is None:
                continue
            if i == j and geom_j.transform.pos.shape[0] == 1:
                continue
            yield (geom_i, geom_j)

From what I understand, this is supposed to iterate through every single possible collision pair. My understanding is that geom_j is supposed to be the only one that can be stationary, and thus have no link_idx.

Because of the structure of my env, not all of my static elements had higher geom ids, so some of these pairs were (i, j) where i is stationary and has no link_idx. This crashes the later code.

A relatively easy fix which seems to work on my local machine is to replace the yield at the end with this:

            if geom_i.link_idx is None and geom_j.link_idx is not None:
                yield (geom_j, geom_i)
            else:
                yield (geom_i, geom_j)

Not sure if this was a known issue or if this solution would mess with the jax compile or anything, but I couldn't find it in existing issues

btaba commented 1 year ago

Hi @mishmish66 , could you provide an XML to reproduce the issue? FWIU both geom_i or geom_j can be a world geometry, but if link_idx is None, it indeed could crash later code (which we should fix if that's the case)