isaac-sim / IsaacLab

Unified framework for robot learning built on NVIDIA Isaac Sim
https://isaac-sim.github.io/IsaacLab

[Question] Cuda Error: "Failed to get Jacobians from backend" when running self.robot.root_physx_view.get_jacobians() together with instance segmentation of camera #346

Open YitianShi opened 5 months ago

YitianShi commented 5 months ago

Question

Hi, I'm setting up a robot bin-picking simulation with a top-down camera that captures semantic or instance segmentation after each grasp attempt. The simulation always fails with "CUDA error: an illegal memory access was encountered" when the robot's Jacobian is fetched from PhysX:

self.robot.root_physx_view.get_jacobians()

Other camera modalities such as RGB, normals, and depth work fine; only semantic or instance segmentation causes the crash.

The closest similar issue I found is: https://forums.developer.nvidia.com/t/multiple-isaac-sim-containers-on-one-gpu-fails-with-cuda-illegal-memory-access-in-omni-physx-tensors-plugin/268134

I'm also confident that GPU memory is nowhere near exhausted, since I'm using 4 RTX 4090 GPUs to run only 3 bin-picking environments with only 2 objects in the bin.

The error message looks like:

2024-04-05 09:44:29 [35,035ms] [Error] [omni.physx.tensors.plugin] CUDA error: an illegal memory access was encountered: ../../../extensions/runtime/source/omni.physx.tensors/plugins/gpu/GpuArticulationView.cpp: 1080
2024-04-05 09:44:29 [35,045ms] [Error] [__main__] CUDA error: an illegal memory access was encountered
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.

2024-04-05 09:44:29 [35,046ms] [Error] [__main__] Traceback (most recent call last):
  File "/home/yitian/Research/simulation_orbit/creat_rl.py", line 66, in <module>
    main()
  File "/home/yitian/Research/simulation_orbit/creat_rl.py", line 58, in main
    simulator.run()
  File "/home/yitian/Research/simulation_orbit/config/rl_env.py", line 178, in run
    self.action_plan()
  File "/home/yitian/Research/simulation_orbit/config/rl_env.py", line 237, in action_plan
    jacobian = self.robot.root_physx_view.get_jacobians()[:, self.ee_jacobi_idx, :, self.robot_entity_cfg.joint_ids]
RuntimeError: CUDA error: an illegal memory access was encountered
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.

2024-04-05 09:44:29 [35,047ms] [Error] [omni.physx.tensors.plugin] CUDA error: an illegal memory access was encountered: ../../../extensions/runtime/source/omni.physx.tensors/plugins/gpu/GpuArticulationView.cpp: 254
2024-04-05 09:44:29 [35,047ms] [Error] [omni.physx.tensors.plugin] CUDA error: an illegal memory access was encountered: ../../../extensions/runtime/source/omni.physx.tensors/plugins/gpu/GpuArticulationView.cpp: 255
2024-04-05 09:44:29 [35,047ms] [Error] [omni.physx.tensors.plugin] CUDA error: an illegal memory access was encountered: ../../../extensions/runtime/source/omni.physx.tensors/plugins/gpu/GpuArticulationView.cpp: 256
2024-04-05 09:44:29 [35,047ms] [Error] [omni.physx.tensors.plugin] CUDA error: an illegal memory access was encountered: ../../../extensions/runtime/source/omni.physx.tensors/plugins/gpu/GpuArticulationView.cpp: 257
2024-04-05 09:44:29 [35,047ms] [Error] [omni.physx.tensors.plugin] CUDA error: an illegal memory access was encountered: ../../../extensions/runtime/source/omni.physx.tensors/plugins/gpu/GpuArticulationView.cpp: 258
2024-04-05 09:44:29 [35,047ms] [Error] [omni.physx.tensors.plugin] CUDA error: an illegal memory access was encountered: ../../../extensions/runtime/source/omni.physx.tensors/plugins/gpu/GpuArticulationView.cpp: 259
2024-04-05 09:44:29 [35,047ms] [Error] [omni.physx.tensors.plugin] CUDA error: an illegal memory access was encountered: ../../../extensions/runtime/source/omni.physx.tensors/plugins/gpu/GpuArticulationView.cpp: 260
2024-04-05 09:44:29 [35,047ms] [Error] [omni.physx.tensors.plugin] CUDA error: an illegal memory access was encountered: ../../../extensions/runtime/source/omni.physx.tensors/plugins/gpu/GpuArticulationView.cpp: 261
2024-04-05 09:44:29 [35,047ms] [Error] [omni.physx.tensors.plugin] CUDA error: an illegal memory access was encountered: ../../../extensions/runtime/source/omni.physx.tensors/plugins/gpu/GpuArticulationView.cpp: 262
2024-04-05 09:44:29 [35,047ms] [Error] [omni.physx.tensors.plugin] CUDA error: an illegal memory access was encountered: ../../../extensions/runtime/source/omni.physx.tensors/plugins/gpu/GpuArticulationView.cpp: 263
2024-04-05 09:44:29 [35,047ms] [Error] [omni.physx.tensors.plugin] CUDA error: an illegal memory access was encountered: ../../../extensions/runtime/source/omni.physx.tensors/plugins/gpu/GpuArticulationView.cpp: 264
2024-04-05 09:44:29 [35,047ms] [Error] [omni.physx.tensors.plugin] CUDA error: an illegal memory access was encountered: ../../../extensions/runtime/source/omni.physx.tensors/plugins/gpu/GpuArticulationView.cpp: 265
2024-04-05 09:44:29 [35,047ms] [Error] [omni.physx.tensors.plugin] CUDA error: an illegal memory access was encountered: ../../../extensions/runtime/source/omni.physx.tensors/plugins/gpu/GpuArticulationView.cpp: 266
2024-04-05 09:44:29 [35,047ms] [Error] [omni.physx.tensors.plugin] CUDA error: an illegal memory access was encountered: ../../../extensions/runtime/source/omni.physx.tensors/plugins/gpu/GpuArticulationView.cpp: 267
2024-04-05 09:44:29 [35,047ms] [Error] [omni.physx.tensors.plugin] CUDA error: an illegal memory access was encountered: ../../../extensions/runtime/source/omni.physx.tensors/plugins/gpu/GpuArticulationView.cpp: 268
2024-04-05 09:44:29 [35,047ms] [Error] [omni.physx.tensors.plugin] CUDA error: an illegal memory access was encountered: ../../../extensions/runtime/source/omni.physx.tensors/plugins/gpu/GpuRigidBodyView.cpp: 226
2024-04-05 09:44:29 [35,047ms] [Error] [omni.physx.tensors.plugin] CUDA error: an illegal memory access was encountered: ../../../extensions/runtime/source/omni.physx.tensors/plugins/gpu/GpuRigidBodyView.cpp: 227
2024-04-05 09:44:29 [35,047ms] [Error] [omni.physx.tensors.plugin] CUDA error: an illegal memory access was encountered: ../../../extensions/runtime/source/omni.physx.tensors/plugins/gpu/GpuRigidBodyView.cpp: 228
2024-04-05 09:44:29 [35,047ms] [Error] [omni.physx.tensors.plugin] CUDA error: an illegal memory access was encountered: ../../../extensions/runtime/source/omni.physx.tensors/plugins/gpu/GpuRigidBodyView.cpp: 229
2024-04-05 09:44:29 [35,047ms] [Error] [omni.physx.tensors.plugin] CUDA error: an illegal memory access was encountered: ../../../extensions/runtime/source/omni.physx.tensors/plugins/gpu/GpuRigidBodyView.cpp: 230
2024-04-05 09:44:29 [35,047ms] [Error] [omni.physx.tensors.plugin] CUDA error: an illegal memory access was encountered: ../../../extensions/runtime/source/omni.physx.tensors/plugins/gpu/GpuRigidBodyView.cpp: 231
2024-04-05 09:44:29 [35,047ms] [Error] [omni.physx.tensors.plugin] CUDA error: an illegal memory access was encountered: ../../../extensions/runtime/source/omni.physx.tensors/plugins/gpu/GpuRigidBodyView.cpp: 232
2024-04-05 09:44:29 [35,047ms] [Error] [omni.physx.tensors.plugin] CUDA error: an illegal memory access was encountered: ../../../extensions/runtime/source/omni.physx.tensors/plugins/gpu/GpuRigidBodyView.cpp: 233
2024-04-05 09:44:29 [35,047ms] [Error] [omni.physx.tensors.plugin] CUDA error: an illegal memory access was encountered: ../../../extensions/runtime/source/omni.physx.tensors/plugins/gpu/GpuRigidBodyView.cpp: 226
2024-04-05 09:44:29 [35,047ms] [Error] [omni.physx.tensors.plugin] CUDA error: an illegal memory access was encountered: ../../../extensions/runtime/source/omni.physx.tensors/plugins/gpu/GpuRigidBodyView.cpp: 227
2024-04-05 09:44:29 [35,047ms] [Error] [omni.physx.tensors.plugin] CUDA error: an illegal memory access was encountered: ../../../extensions/runtime/source/omni.physx.tensors/plugins/gpu/GpuRigidBodyView.cpp: 228
2024-04-05 09:44:29 [35,047ms] [Error] [omni.physx.tensors.plugin] CUDA error: an illegal memory access was encountered: ../../../extensions/runtime/source/omni.physx.tensors/plugins/gpu/GpuRigidBodyView.cpp: 229
2024-04-05 09:44:29 [35,047ms] [Error] [omni.physx.tensors.plugin] CUDA error: an illegal memory access was encountered: ../../../extensions/runtime/source/omni.physx.tensors/plugins/gpu/GpuRigidBodyView.cpp: 230
2024-04-05 09:44:29 [35,047ms] [Error] [omni.physx.tensors.plugin] CUDA error: an illegal memory access was encountered: ../../../extensions/runtime/source/omni.physx.tensors/plugins/gpu/GpuRigidBodyView.cpp: 231
2024-04-05 09:44:29 [35,047ms] [Error] [omni.physx.tensors.plugin] CUDA error: an illegal memory access was encountered: ../../../extensions/runtime/source/omni.physx.tensors/plugins/gpu/GpuRigidBodyView.cpp: 232
2024-04-05 09:44:29 [35,047ms] [Error] [omni.physx.tensors.plugin] CUDA error: an illegal memory access was encountered: ../../../extensions/runtime/source/omni.physx.tensors/plugins/gpu/GpuRigidBodyView.cpp: 233
2024-04-05 09:44:29 [35,050ms] [Error] [carb.cudainterop.plugin] CUDA error 700: cudaErrorIllegalAddress - an illegal memory access was encountered)
2024-04-05 09:44:29 [35,050ms] [Error] [carb.cudainterop.plugin] Failed to copy async (dst: 0x7b8b58921890, src: 0x7b8b0de00000, size: 18672).
2024-04-05 09:44:29 [35,050ms] [Error] [omni.syntheticdata.plugin] CUDA error 700: cudaErrorIllegalAddress - an illegal memory access was encountered)
2024-04-05 09:44:29 [35,050ms] [Error] [omni.syntheticdata.plugin] CUDA error 700: cudaErrorIllegalAddress - an illegal memory access was encountered)
2024-04-05 09:44:29 [35,050ms] [Error] [carb.cudainterop.plugin] CUDA error 700: cudaErrorIllegalAddress - an illegal memory access was encountered)
2024-04-05 09:44:29 [35,050ms] [Error] [carb.cudainterop.plugin] Failed to copy async (dst: 0x7b9ce8068160, src: 0x7b8b0fc00000, size: 18672).
2024-04-05 09:44:29 [35,051ms] [Error] [carb.cudainterop.plugin] CUDA error 700: cudaErrorIllegalAddress - an illegal memory access was encountered)
2024-04-05 09:44:29 [35,051ms] [Error] [carb.cudainterop.plugin] Failed to allocate CUDA device memory (size: 18672).
2024-04-05 09:44:29 [35,051ms] [Error] [omni.gpucompute-cuda.plugin] CUDA error in mallocAsyncM: an illegal memory access was encountered: ../../../source/plugins/omni.gpucompute-cuda/GpuCompute-Cuda.cpp: 236 (device 0)
2024-04-05 09:44:29 [35,051ms] [Error] [omni.gpucompute-cuda.plugin] CUDA error in memcpy: invalid device ordinal: ../../../source/plugins/omni.gpucompute-cuda/GpuCompute-Cuda.cpp: 337 (device 0)
2024-04-05 09:44:29 [35,051ms] [Error] [omni.gpucompute-cuda.plugin] cudaMemcpy failed to copy 18384 bytes from 0x602258400 on device 0 to (nil) on device -2 - kind=3
2024-04-05 09:44:29 [35,051ms] [Error] [carb.cudainterop.plugin] CUDA error 700: cudaErrorIllegalAddress - an illegal memory access was encountered)
2024-04-05 09:44:29 [35,051ms] [Error] [carb.cudainterop.plugin] Failed to free CUDA device memory.
2024-04-05 09:44:29 [35,051ms] [Error] [omni.gpucompute-cuda.plugin] CUDA error in freeAsync: an illegal memory access was encountered: ../../../source/plugins/omni.gpucompute-cuda/GpuCompute-Cuda.cpp: 268 (device 0)
2024-04-05 09:44:29 [35,051ms] [Error] [omni.gpucompute-cuda.plugin] CUDA error in memcpy: an illegal memory access was encountered: ../../../source/plugins/omni.gpucompute-cuda/GpuCompute-Cuda.cpp: 337 (device 0)
2024-04-05 09:44:29 [35,051ms] [Error] [omni.gpucompute-cuda.plugin] cudaMemcpyAsync failed to copy 8 bytes from 0x7b8bbfbb90e0 on device -2 to 0x60225cc00 on device 0 - kind=1
2024-04-05 09:44:29 [35,051ms] [Error] [carb.cudainterop.plugin] CUDA error 700: cudaErrorIllegalAddress - an illegal memory access was encountered)
2024-04-05 09:44:29 [35,051ms] [Error] [carb.cudainterop.plugin] Failed to copy async (dst: (nil), src: 0x7b8b0dc00000, size: 18672).
2024-04-05 09:44:29 [35,051ms] [Error] [omni.syntheticdata.plugin] CUDA error 700: cudaErrorIllegalAddress - an illegal memory access was encountered)
2024-04-05 09:44:29 [35,052ms] [Error] [omni.gpucompute-cuda.plugin] CUDA error in memcpy: an illegal memory access was encountered: ../../../source/plugins/omni.gpucompute-cuda/GpuCompute-Cuda.cpp: 337 (device 0)
2024-04-05 09:44:29 [35,052ms] [Error] [omni.gpucompute-cuda.plugin] cudaMemcpy failed to copy 1228800 bytes from 0x602000000 on device 0 to 0x7b8bc2166030 on device -2 - kind=2
2024-04-05 09:44:29 [35,052ms] [Error] [omni.gpucompute-cuda.plugin] CUDA error in memcpy: an illegal memory access was encountered: ../../../source/plugins/omni.gpucompute-cuda/GpuCompute-Cuda.cpp: 337 (device 0)
2024-04-05 09:44:29 [35,052ms] [Error] [omni.gpucompute-cuda.plugin] cudaMemcpy failed to copy 1228800 bytes from 0x60212c200 on device 0 to 0x7b8bc2292050 on device -2 - kind=2
2024-04-05 09:44:29 [35,052ms] [Error] [omni.gpucompute-cuda.plugin] CUDA error in memcpy: an illegal memory access was encountered: ../../../source/plugins/omni.gpucompute-cuda/GpuCompute-Cuda.cpp: 337 (device 0)
2024-04-05 09:44:29 [35,052ms] [Error] [omni.gpucompute-cuda.plugin] cudaMemcpy failed to copy 1228800 bytes from 0x60225ce00 on device 0 to 0x7b8bc23be070 on device -2 - kind=2
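
The log above suggests passing CUDA_LAUNCH_BLOCKING=1 for debugging. A minimal sketch of one way to do that from Python follows; the helper name `enable_blocking_cuda_launches` is hypothetical, and the sketch assumes the variable has to be in the environment before CUDA is initialized, so the call belongs at the very top of the entry script, ahead of any torch or simulator import.

```python
import os


def enable_blocking_cuda_launches(env=None):
    """Set CUDA_LAUNCH_BLOCKING=1 in the given mapping (default: os.environ)."""
    target = os.environ if env is None else env
    target["CUDA_LAUNCH_BLOCKING"] = "1"
    return target


# Hypothetical placement: first statement of the entry script, before
# anything that could create a CUDA context is imported.
enable_blocking_cuda_launches()
print(os.environ["CUDA_LAUNCH_BLOCKING"])
```

Exporting the variable in the shell before launching the script has the same effect and avoids any import-order concerns.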

After setting CUDA_LAUNCH_BLOCKING=1, the output is:

2024-04-05 09:49:52 [34,447ms] [Error] [omni.physx.tensors.plugin] CUDA error: an illegal memory access was encountered: ../../../extensions/runtime/source/omni.physx.tensors/plugins/gpu/GpuArticulationView.cpp: 1069
2024-04-05 09:49:52 [34,447ms] [Error] [omni.physx.tensors.plugin] CUDA error: an illegal memory access was encountered: ../../../extensions/runtime/source/omni.physx.tensors/plugins/gpu/GpuArticulationView.cpp: 1075
2024-04-05 09:49:52 [34,447ms] [Error] [__main__] Failed to get Jacobians from backend
2024-04-05 09:49:52 [34,448ms] [Error] [__main__] Traceback (most recent call last):
  File "/home/yitian/Research/simulation_orbit/creat_rl.py", line 66, in <module>
    main()
  File "/home/yitian/Research/simulation_orbit/creat_rl.py", line 58, in main
    simulator.run()
  File "/home/yitian/Research/simulation_orbit/config/rl_env.py", line 178, in run
    self.action_plan()
  File "/home/yitian/Research/simulation_orbit/config/rl_env.py", line 237, in action_plan
    jacobian = self.robot.root_physx_view.get_jacobians()[:, self.ee_jacobi_idx, :, self.robot_entity_cfg.joint_ids]
  File "/home/yitian/.local/share/ov/pkg/isaac_sim-2023.1.1/extsPhysics/omni.physics.tensors-105.1.12-5.1/omni/physics/tensors/impl/api.py", line 574, in get_jacobians
    raise Exception("Failed to get Jacobians from backend")
Exception: Failed to get Jacobians from backend

2024-04-05 09:49:52 [34,449ms] [Error] [omni.physx.tensors.plugin] CUDA error: an illegal memory access was encountered: ../../../extensions/runtime/source/omni.physx.tensors/plugins/gpu/GpuRigidBodyView.cpp: 226
2024-04-05 09:49:52 [34,449ms] [Error] [omni.physx.tensors.plugin] CUDA error: an illegal memory access was encountered: ../../../extensions/runtime/source/omni.physx.tensors/plugins/gpu/GpuRigidBodyView.cpp: 227
2024-04-05 09:49:52 [34,449ms] [Error] [omni.physx.tensors.plugin] CUDA error: an illegal memory access was encountered: ../../../extensions/runtime/source/omni.physx.tensors/plugins/gpu/GpuRigidBodyView.cpp: 228
2024-04-05 09:49:52 [34,449ms] [Error] [omni.physx.tensors.plugin] CUDA error: an illegal memory access was encountered: ../../../extensions/runtime/source/omni.physx.tensors/plugins/gpu/GpuRigidBodyView.cpp: 229
2024-04-05 09:49:52 [34,449ms] [Error] [omni.physx.tensors.plugin] CUDA error: an illegal memory access was encountered: ../../../extensions/runtime/source/omni.physx.tensors/plugins/gpu/GpuRigidBodyView.cpp: 230
2024-04-05 09:49:52 [34,449ms] [Error] [omni.physx.tensors.plugin] CUDA error: an illegal memory access was encountered: ../../../extensions/runtime/source/omni.physx.tensors/plugins/gpu/GpuRigidBodyView.cpp: 231
2024-04-05 09:49:52 [34,449ms] [Error] [omni.physx.tensors.plugin] CUDA error: an illegal memory access was encountered: ../../../extensions/runtime/source/omni.physx.tensors/plugins/gpu/GpuRigidBodyView.cpp: 232
2024-04-05 09:49:52 [34,449ms] [Error] [omni.physx.tensors.plugin] CUDA error: an illegal memory access was encountered: ../../../extensions/runtime/source/omni.physx.tensors/plugins/gpu/GpuRigidBodyView.cpp: 233
2024-04-05 09:49:52 [34,449ms] [Error] [omni.physx.tensors.plugin] CUDA error: an illegal memory access was encountered: ../../../extensions/runtime/source/omni.physx.tensors/plugins/gpu/GpuRigidBodyView.cpp: 226
2024-04-05 09:49:52 [34,449ms] [Error] [omni.physx.tensors.plugin] CUDA error: an illegal memory access was encountered: ../../../extensions/runtime/source/omni.physx.tensors/plugins/gpu/GpuRigidBodyView.cpp: 227
2024-04-05 09:49:52 [34,449ms] [Error] [omni.physx.tensors.plugin] CUDA error: an illegal memory access was encountered: ../../../extensions/runtime/source/omni.physx.tensors/plugins/gpu/GpuRigidBodyView.cpp: 228
2024-04-05 09:49:52 [34,449ms] [Error] [omni.physx.tensors.plugin] CUDA error: an illegal memory access was encountered: ../../../extensions/runtime/source/omni.physx.tensors/plugins/gpu/GpuRigidBodyView.cpp: 229
2024-04-05 09:49:52 [34,449ms] [Error] [omni.physx.tensors.plugin] CUDA error: an illegal memory access was encountered: ../../../extensions/runtime/source/omni.physx.tensors/plugins/gpu/GpuRigidBodyView.cpp: 230
2024-04-05 09:49:52 [34,449ms] [Error] [omni.physx.tensors.plugin] CUDA error: an illegal memory access was encountered: ../../../extensions/runtime/source/omni.physx.tensors/plugins/gpu/GpuRigidBodyView.cpp: 231
2024-04-05 09:49:52 [34,449ms] [Error] [omni.physx.tensors.plugin] CUDA error: an illegal memory access was encountered: ../../../extensions/runtime/source/omni.physx.tensors/plugins/gpu/GpuRigidBodyView.cpp: 232
2024-04-05 09:49:52 [34,449ms] [Error] [omni.physx.tensors.plugin] CUDA error: an illegal memory access was encountered: ../../../extensions/runtime/source/omni.physx.tensors/plugins/gpu/GpuRigidBodyView.cpp: 233
2024-04-05 09:49:52 [34,452ms] [Error] [omni.syntheticdata.plugin] CUDA error 700: cudaErrorIllegalAddress - an illegal memory access was encountered)
2024-04-05 09:49:52 [34,452ms] [Error] [carb.cudainterop.plugin] CUDA error 700: cudaErrorIllegalAddress - an illegal memory access was encountered)
2024-04-05 09:49:52 [34,452ms] [Error] [carb.cudainterop.plugin] Failed to copy async (dst: 0x7e7920f88c10, src: 0x7e67dfc00000, size: 18672).
2024-04-05 09:49:52 [34,452ms] [Error] [omni.syntheticdata.plugin] CUDA error 700: cudaErrorIllegalAddress - an illegal memory access was encountered)
2024-04-05 09:49:52 [34,452ms] [Error] [carb.cudainterop.plugin] CUDA error 700: cudaErrorIllegalAddress - an illegal memory access was encountered)
2024-04-05 09:49:52 [34,452ms] [Error] [carb.cudainterop.plugin] Failed to copy async (dst: 0x7e7967f803f0, src: 0x7e67dde00000, size: 18672).
2024-04-05 09:49:52 [34,452ms] [Error] [carb.cudainterop.plugin] CUDA error 700: cudaErrorIllegalAddress - an illegal memory access was encountered)
2024-04-05 09:49:52 [34,452ms] [Error] [carb.cudainterop.plugin] Failed to allocate CUDA device memory (size: 18672).
2024-04-05 09:49:52 [34,452ms] [Error] [omni.gpucompute-cuda.plugin] CUDA error in mallocAsyncM: an illegal memory access was encountered: ../../../source/plugins/omni.gpucompute-cuda/GpuCompute-Cuda.cpp: 236 (device 0)
2024-04-05 09:49:52 [34,452ms] [Error] [omni.gpucompute-cuda.plugin] CUDA error in memcpy: invalid device ordinal: ../../../source/plugins/omni.gpucompute-cuda/GpuCompute-Cuda.cpp: 337 (device 0)
2024-04-05 09:49:52 [34,453ms] [Error] [omni.gpucompute-cuda.plugin] cudaMemcpy failed to copy 18384 bytes from 0x602258400 on device 0 to (nil) on device -2 - kind=3
2024-04-05 09:49:52 [34,453ms] [Error] [carb.cudainterop.plugin] CUDA error 700: cudaErrorIllegalAddress - an illegal memory access was encountered)
2024-04-05 09:49:52 [34,453ms] [Error] [carb.cudainterop.plugin] Failed to free CUDA device memory.
2024-04-05 09:49:52 [34,453ms] [Error] [omni.gpucompute-cuda.plugin] CUDA error in freeAsync: an illegal memory access was encountered: ../../../source/plugins/omni.gpucompute-cuda/GpuCompute-Cuda.cpp: 268 (device 0)
2024-04-05 09:49:52 [34,453ms] [Error] [omni.gpucompute-cuda.plugin] CUDA error in memcpy: an illegal memory access was encountered: ../../../source/plugins/omni.gpucompute-cuda/GpuCompute-Cuda.cpp: 337 (device 0)
2024-04-05 09:49:52 [34,453ms] [Error] [omni.gpucompute-cuda.plugin] cudaMemcpyAsync failed to copy 8 bytes from 0x7e68a9c2a6e0 on device -2 to 0x60225cc00 on device 0 - kind=1
2024-04-05 09:49:52 [34,453ms] [Error] [carb.cudainterop.plugin] CUDA error 700: cudaErrorIllegalAddress - an illegal memory access was encountered)
2024-04-05 09:49:52 [34,453ms] [Error] [carb.cudainterop.plugin] Failed to copy async (dst: (nil), src: 0x7e67ddc00000, size: 18672).
2024-04-05 09:49:52 [34,453ms] [Error] [omni.syntheticdata.plugin] CUDA error 700: cudaErrorIllegalAddress - an illegal memory access was encountered)
2024-04-05 09:49:52 [34,454ms] [Error] [omni.gpucompute-cuda.plugin] CUDA error in memcpy: an illegal memory access was encountered: ../../../source/plugins/omni.gpucompute-cuda/GpuCompute-Cuda.cpp: 337 (device 0)
2024-04-05 09:49:52 [34,454ms] [Error] [omni.gpucompute-cuda.plugin] cudaMemcpy failed to copy 1228800 bytes from 0x602000000 on device 0 to 0x7e68a9c2e430 on device -2 - kind=2
2024-04-05 09:49:52 [34,454ms] [Error] [omni.gpucompute-cuda.plugin] CUDA error in memcpy: an illegal memory access was encountered: ../../../source/plugins/omni.gpucompute-cuda/GpuCompute-Cuda.cpp: 337 (device 0)
2024-04-05 09:49:52 [34,454ms] [Error] [omni.gpucompute-cuda.plugin] cudaMemcpy failed to copy 1228800 bytes from 0x60212c200 on device 0 to 0x7e68d0abd030 on device -2 - kind=2
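
Since one illegal memory access cascades into dozens of follow-up errors, the earliest record is usually the most informative one. The following is a rough sketch, not part of Isaac Sim, that extracts the first [Error] record and counts errors per plugin. The regular expression is an assumption based only on the line layout visible in the logs above, and `summarize_errors` is a hypothetical helper name.

```python
import re
from collections import Counter

# Assumed layout, taken from the log lines in this issue:
# 2024-04-05 09:44:29 [35,035ms] [Error] [omni.physx.tensors.plugin] message
LOG_LINE = re.compile(
    r"^(?P<date>\d{4}-\d{2}-\d{2}) (?P<time>\d{2}:\d{2}:\d{2}) "
    r"\[(?P<ms>[\d,]+)ms\] \[(?P<level>\w+)\] \[(?P<source>[^\]]+)\] (?P<msg>.*)$"
)


def summarize_errors(log_text):
    """Return (first_error, counts): the earliest [Error] record and a per-source tally."""
    first = None
    counts = Counter()
    for line in log_text.splitlines():
        match = LOG_LINE.match(line.strip())
        if match is None or match.group("level") != "Error":
            continue  # traceback lines and non-error records are skipped
        counts[match.group("source")] += 1
        if first is None:
            first = (match.group("source"), match.group("msg"))
    return first, counts


# Three records copied from the CUDA_LAUNCH_BLOCKING=1 log above.
sample = """\
2024-04-05 09:49:52 [34,447ms] [Error] [omni.physx.tensors.plugin] CUDA error: an illegal memory access was encountered: ../../../extensions/runtime/source/omni.physx.tensors/plugins/gpu/GpuArticulationView.cpp: 1069
2024-04-05 09:49:52 [34,447ms] [Error] [__main__] Failed to get Jacobians from backend
2024-04-05 09:49:52 [34,452ms] [Error] [carb.cudainterop.plugin] Failed to allocate CUDA device memory (size: 18672).
"""
first, counts = summarize_errors(sample)
print(first)
print(dict(counts))
```

With blocking launches enabled, the first record points at GpuArticulationView.cpp: 1069, which is consistent with the fault surfacing inside the Jacobian query rather than in the later synthetic-data copies.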

pascal-roth commented 5 months ago

As I understand it, your environments do not require a multi-GPU setup. Could you try running the code on a single GPU to see whether the same error still occurs? In any case, this error seems to be related to Isaac Sim and the TensorAPI rather than Orbit itself. To get better feedback, I would recommend posting your question in their channel. I have heard of one case where the problem was resolved by switching the GPU model to a 4060.
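
One possible way to try the single-GPU test is to restrict which devices CUDA can see through the CUDA_VISIBLE_DEVICES environment variable. The sketch below is only illustrative: `single_gpu_env` is a hypothetical helper, the child command is a stand-in for whatever launches the simulation, and the variable only affects CUDA, so the renderer may still pick its GPUs through its own settings.

```python
import os
import subprocess
import sys


def single_gpu_env(gpu_index=0, base_env=None):
    """Return a copy of the environment that exposes only one GPU to CUDA."""
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_VISIBLE_DEVICES"] = str(gpu_index)
    return env


# Stand-in child process that just echoes the variable; in practice the
# command would be the one that starts the simulation script.
child = subprocess.run(
    [sys.executable, "-c", "import os; print(os.environ['CUDA_VISIBLE_DEVICES'])"],
    env=single_gpu_env(0),
    capture_output=True,
    text=True,
    check=True,
)
print(child.stdout.strip())
```

If the crash disappears with a single visible device, that would point toward a multi-GPU interaction; if it persists, the device count is probably not the cause.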

pascal-roth commented 5 months ago

We also contacted the Isaac Sim team about this a while ago. @Mayankm96, maybe we can ping them again.