google-deepmind / mujoco

Multi-Joint dynamics with Contact. A general purpose physics simulator.
https://mujoco.org
Apache License 2.0
8.27k stars 827 forks source link

Use MjSpec on document with plugin & replicate leads to segfault in Mujoco 3.2.4 #2150

Closed jjyyxx closed 1 month ago

jjyyxx commented 1 month ago

Intro

Hi!

I am a graduate student at HKU, I use MuJoCo for my research on robotic manipulation.

My setup

Mujoco 3.2.4, C or Python (Mujoco 3.2.3 is not affected)

What's happening? What did you expect?

Use MjSpec on document with plugin & replicate leads to segfault in Mujoco 3.2.4

Backtrace

#0  0x00007ffff6a98572 in mjXWriter::Extension (this=0x7fffffffc610, root=0x55555706e220) at mujoco/src/xml/xml_native_writer.cc:1371
#1  0x00007ffff6a8e3ea in mjXWriter::Write[abi:cxx11](char*, unsigned long) (this=0x7fffffffc610, error=0x0, error_sz=0) at mujoco/src/xml/xml_native_writer.cc:895
#2  0x00007ffff6a5f4a4 in WriteXML[abi:cxx11](mjSpec_ const*, char*, int) (spec=0x555556d20ea8, error=0x0, nerror=0) at mujoco/src/xml/xml.cc:411
#3  0x00007ffff6a5b95d in mj_saveXMLString (s=0x555556d20ea8, xml=0x0, xml_sz=0, error=0x0, error_sz=0) at mujoco/src/xml/xml_api.cc:248

Code

  for (const auto& [plugin, slot] : model->ActivePlugins()) {
    if (seen_plugins.find(plugin) == seen_plugins.end()) {
      plugin_elem = InsertEnd(section, "plugin");
      WriteAttrTxt(plugin_elem, "plugin", plugin->name);
    }
  }

Problem source backtrace

#0  mjCModel::operator+= (this=0x555556d206c0, other=...) at mujoco/src/user/user_model.cc:400
#1  0x00007ffff6a01e0e in mjCBody::operator+= (this=0x555556d47f90, other=...) at mujoco/src/user/user_objects.cc:900
#2  0x00007ffff690d926 in mjs_attachFrame (parent=0x555556d48490, child=0x555556d493b8, prefix=0x7ffff6b770b4 "", suffix=0x7fffffffb7e0 "0") at mujoco/src/user/user_api.cc:157
#3  0x00007ffff6a7776f in mjXReader::Body (this=0x7fffffffc090, section=0x555556d196c0, body=0x555556d48490, frame=0x555556d48928, vfs=0x0) at mujoco/src/xml/xml_native_reader.cc:3614
#4  0x00007ffff6a775d2 in mjXReader::Body (this=0x7fffffffc090, section=0x555556d16e40, body=0x555556d1f120, frame=0x0, vfs=0x0) at mujoco/src/xml/xml_native_reader.cc:3594
#5  0x00007ffff6a62b95 in mjXReader::Parse (this=0x7fffffffc090, root=0x555556d16300, vfs=0x0) at mujoco/src/xml/xml_native_reader.cc:955

Code

  // create new plugins and map them
  CopyPlugin(plugins_, other.plugins_, bodies_);
  CopyPlugin(plugins_, other.plugins_, geoms_);
  CopyPlugin(plugins_, other.plugins_, meshes_);
  CopyPlugin(plugins_, other.plugins_, actuators_);
  CopyPlugin(plugins_, other.plugins_, sensors_);
  for (const auto& active_plugin : other.active_plugins_) {
    active_plugins_.emplace_back(active_plugin);
  }

When this and other points to the same model, copying other.active_plugins_ into this->active_plugins_ makes the active_plugins_ size grow exponentially (in fact, inserting while iterating the same vector is already very close to buggy code, but STL seems to realloc successfully for smaller reallocations, so small replication count does not break code.)

Besides, many other copies in this function might be vulnerable to such cases as well, but I do not fully understand the intention of mjs_attachFrame add-assigning the full model, so no further comment.

Steps for reproduction

Run the snippet below. Process segfaults at spec.to_xml.

Minimal model for reproduction

(Embedded in code)

Code required for reproduction

import mujoco
spec = mujoco.MjSpec.from_string("""
<mujoco>
    <extension>
        <plugin plugin="mujoco.sdf.torus">
            <instance name="torus" />
        </plugin>
    </extension>
    <asset>
        <mesh name="torus">
            <plugin instance="torus" />
        </mesh>
    </asset>
    <worldbody>
        <geom type="sdf" mesh="torus">
            <plugin instance="torus" />
        </geom>
        <replicate count="12">
            <body />
        </replicate>
    </worldbody>
</mujoco>
""")
spec.compile()
spec.to_xml()

Confirmations

jjyyxx commented 1 month ago

@saran-t @quagla In fact, while being handy already, MjSpec is quite broken under many edge cases. Currently, I do not have time to submit issue for each of them, just list below:

  1. weld equality, which references body name (body1, body2) defined inside replicate, does not expand to a full list of welds as contact/exclude or actuator/general do, but silently disappears in the output xml.
  2. asset/model + attach is very broken. (a) the output xml contains two levels of nameless <default>, leading to error when loaded back; (b) MjSpec loading raises error if the parent model contains plugin definition; (c) the output xml is full of asset path inconsistencies (this may be harder to deal with)
quagla commented 1 month ago

Thanks. I solved the original issue with plugins.

I take note to add a warning for your point 1, which is a behavior I am aware of (it should work if both bodies are inside replicate, right?)

Could you provide minimal examples for 2a, 2b or 2c by opening other issues when you have time?

jjyyxx commented 1 month ago

@quagla Created issue #2162