google-deepmind / mujoco

Multi-Joint dynamics with Contact. A general purpose physics simulator.
https://mujoco.org
Apache License 2.0
7.85k stars, 786 forks

[MJX] Memory allocation issue, occurs in my code only after updating to mujoco-mjx version 3.2.1 and 3.2.2 #1918

Closed AlexS28 closed 3 weeks ago

AlexS28 commented 1 month ago

Hi, I have been really enjoying MJX and have been able to make my biped walk both in simulation and on hardware, so I am excited to share these results in the next few months. However, after updating to mujoco version 3.2.1 or 3.2.2, I get the error copied below. Is it possible that the new versions require much more memory than previous ones? Note that I do NOT have this issue with version 3.2.0 or below, so I don't know what change in the two newer versions causes my code to stop working. Because I'd really like to use the sensor class (which I believe was added in the versions after 3.2.0), it would be great to know what is going on. Let me know if there is any other info I can provide, as I am not sure what would be helpful for determining the cause. I have an NVIDIA 4080 graphics card.

2024-08-21 21:58:26.434829: W external/xla/xla/service/hlo_rematerialization.cc:3005] Can't reduce memory use below -795.17MiB (-833792853 bytes) by rematerialization; only reduced to 12.70GiB (13641024129 bytes), down from 12.76GiB (13698962697 bytes) originally
2024-08-21 21:58:39.330747: W external/xla/xla/tsl/framework/bfc_allocator.cc:482] Allocator (GPU_0_bfc) ran out of memory trying to allocate 27.53MiB (rounded to 28868608)requested by op
2024-08-21 21:58:39.331330: W external/xla/xla/tsl/framework/bfc_allocator.cc:494] ****************************************************************************************************
E0821 21:58:39.331400 6448 pjrt_stream_executor_client.cc:3067] Execution of replica 0 failed: RESOURCE_EXHAUSTED: Out of memory while trying to allocate 28868608 bytes.

WangHanSolo commented 3 weeks ago

I noticed that mjx.Data uses a lot more memory after dbe18f5. The culprits are the bvh variables. Hope this helps!
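
For anyone who wants to check which fields dominate on their own setup, here is a rough sketch that sums the byte size of every array field of a dataclass instance. It assumes mjx.Data is a dataclass whose array fields expose `.nbytes` (as NumPy and JAX arrays do). The helper names `field_nbytes` and `largest_fields` are made up for illustration and are not part of MJX.

```python
import dataclasses


def field_nbytes(obj):
    """Return {field_name: bytes} for a dataclass instance holding arrays."""
    sizes = {}
    for f in dataclasses.fields(obj):
        value = getattr(obj, f.name)
        if dataclasses.is_dataclass(value) and not isinstance(value, type):
            # Nested dataclass instance: add up the sizes of its own fields.
            sizes[f.name] = sum(field_nbytes(value).values())
        else:
            # Arrays expose .nbytes; anything without it is counted as 0.
            sizes[f.name] = int(getattr(value, "nbytes", 0))
    return sizes


def largest_fields(obj, top=5):
    """Return the `top` (name, bytes) pairs, largest first."""
    sizes = field_nbytes(obj)
    return sorted(sizes.items(), key=lambda kv: kv[1], reverse=True)[:top]
```

Calling `largest_fields(data)` on a (possibly batched) data object before and after the upgrade should show whether the bvh-related fields account for the difference.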

erikfrey commented 3 weeks ago

Hello @AlexS28 - @WangHanSolo is correct, this is fixed in head as of 390bce235283cab56df05b5bd1cc90ac58d81e4c and will be part of the next release.

For the time being, if you wish to use the current version of MuJoCo, add this somewhere after loading the model, and before putting or setting data:

model = model.replace(
    nbvh=0,
    nbvhstatic=0,
    nbvhdynamic=0,
)
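
A slightly more defensive variant of the workaround above, as a sketch only. It assumes `model` is the mjx.Model returned by `mjx.put_model` (a dataclass with a `.replace` method), and it only zeroes the bvh count fields that exist on the installed version, so it should be a no-op once the fix is released. The helper name `zero_bvh_counts` is hypothetical, not an MJX function.

```python
import dataclasses

# Field names taken from the workaround above.
BVH_COUNT_FIELDS = ("nbvh", "nbvhstatic", "nbvhdynamic")


def zero_bvh_counts(model):
    """Return a copy of `model` with any present bvh count fields set to 0."""
    present = {f.name for f in dataclasses.fields(model)}
    updates = {name: 0 for name in BVH_COUNT_FIELDS if name in present}
    # Nothing to change if this version has none of the fields.
    return model.replace(**updates) if updates else model


# Intended use (assumed ordering, not run here):
#   mj_model = mujoco.MjModel.from_xml_path(path_to_xml)
#   mjx_model = zero_bvh_counts(mjx.put_model(mj_model))
#   mjx_data = mjx.make_data(mjx_model)
```

Checking which fields are present avoids an error from `replace` on versions where these fields do not exist.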