isaac-sim / OmniIsaacGymEnvs

Reinforcement Learning Environments for Omniverse Isaac Gym

undefined symbol: cuDeviceGetCount #145

Open VladimirFokow opened 3 months ago

VladimirFokow commented 3 months ago
PYTHON_PATH scripts/rlgames_train.py task=Cartpole headless=True

-> leads to:

2024-03-08 08:34:07 [145,334ms] [Warning] [omni.physx.plugin] PhysX warning: GPU solver pipeline failed, switching to software, FILE /buildAgent/work/eb2f45c4acc808a0/physx/source/simulationcontroller/src/ScScene.cpp, LINE 819
2024-03-08 08:34:07 [145,334ms] [Warning] [omni.physx.plugin] PhysX warning: GPU Bp pipeline failed, switching to software, FILE /buildAgent/work/eb2f45c4acc808a0/physx/source/simulationcontroller/src/ScScene.cpp, LINE 827
/isaac-sim/kit/python/bin/python3: symbol lookup error: /isaac-sim/extsPhysics/omni.physx-105.1.12-5.1/bin/libomni.physx.plugin.so: undefined symbol: cuDeviceGetCount
There was an error running python

I am trying to run it on a node with 4 A100-SXM4-80GB GPUs,
inside a Charliecloud container (the `--nvidia` flag was successfully injected into the container).


Full output:
```
PYTHON_PATH scripts/rlgames_train.py task=Cartpole headless=True
/isaac-sim/extscache/omni.pip.torch-2_0_1-2.0.2+105.1.lx64/torch-2-0-1/torch/utils/tensorboard/__init__.py:4: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead.
  if not hasattr(tensorboard, "__version__") or LooseVersion(
Starting kit application with the following args: ['/isaac-sim/exts/omni.isaac.kit/omni/isaac/kit/simulation_app.py', '/isaac-sim/apps/omni.isaac.sim.python.gym.headless.kit', '--/app/tokens/exe-path=/isaac-sim/kit', '--/persistent/app/viewport/displayOptions=3094', '--/rtx/materialDb/syncLoads=True', '--/rtx/hydra/materialSyncLoads=True', '--/omni.kit.plugin/syncUsdLoads=True', '--/app/renderer/resolution/width=1280', '--/app/renderer/resolution/height=720', '--/app/window/width=1440', '--/app/window/height=900', '--/renderer/multiGpu/enabled=True', '--/app/fastShutdown=True', '--ext-folder', '/isaac-sim/exts', '--ext-folder', '/isaac-sim/apps', '--/physics/cudaDevice=0', '--portable', '--no-window', '--allow-root']
Passing the following args to the base kit application: ['task=Cartpole', 'headless=True']
[Info] [carb] Logging to file: /isaac-sim/kit/logs/Kit/Isaac-Sim/2023.1/kit_20240308_083141.log
2024-03-08 08:31:41 [0ms] [Warning] [omni.kit.app.plugin] No crash reporter present, dumps uploading isn't available.
2024-03-08 08:31:41 [14ms] [Warning] [omni.ext.plugin] [ext: omni.kit.sequencer.usd-103.4.0+105.0] Built using kit version: 105.0. Current version: 105.1. It is considered compatible, but building with a newer version is recommended.
2024-03-08 08:31:41 [15ms] [Warning] [omni.ext.plugin] [ext: omni.kit.sequencer.core-103.4.0+105.0] Built using kit version: 105.0. Current version: 105.1. It is considered compatible, but building with a newer version is recommended.
2024-03-08 08:31:41 [15ms] [Warning] [omni.ext.plugin] [ext: omni.kit.widget.timeline-105.0.1+105.0] Built using kit version: 105.0. Current version: 105.1. It is considered compatible, but building with a newer version is recommended.
2024-03-08 08:31:41 [15ms] [Warning] [omni.ext.plugin] [ext: omni.kit.window.sequencer-103.4.2-dev.3+105.0] Built using kit version: 105.0. Current version: 105.1. It is considered compatible, but building with a newer version is recommended.
2024-03-08 08:31:41 [15ms] [Warning] [omni.ext.plugin] [ext: omni.paint.brush.attributes-1.3.1+105.0] Built using kit version: 105.0. Current version: 105.1. It is considered compatible, but building with a newer version is recommended.
2024-03-08 08:31:41 [16ms] [Warning] [omni.ext.plugin] [ext: omni.usd.schema.sequence-2.3.0+105.0.lx64.r.cp310] Built using kit version: 105.0. Current version: 105.1. It is considered compatible, but building with a newer version is recommended.
[0.043s] [ext: omni.kit.async_engine-0.0.0] startup
[0.290s] [ext: omni.assets.plugins-0.0.0] startup
[0.291s] [ext: omni.stats-0.0.0] startup
[0.292s] [ext: omni.client-1.0.1] startup
[0.303s] [ext: omni.gpu_foundation-0.0.0] startup
[0.311s] [ext: omni.rtx.shadercache.vulkan-1.0.0] startup
[0.313s] [ext: carb.windowing.plugins-1.0.0] startup
2024-03-08 08:31:42 [301ms] [Warning] [carb.windowing-glfw.plugin] GLFW initialization failed.
2024-03-08 08:31:42 [301ms] [Warning] [carb] Failed to startup plugin carb.windowing-glfw.plugin (interfaces: [carb::windowing::IGLContext v1.0],[carb::windowing::IWindowing v1.4]) (impl: carb.windowing-glfw.plugin)
[0.319s] [ext: omni.kit.renderer.init-0.0.0] startup
2024-03-08 08:31:42 [358ms] [Warning] [omni.platforminfo.plugin] failed to open the default display. Can't verify X Server version.
2024-03-08 08:31:43 [1,646ms] [Warning] [carb.windowing-glfw.plugin] GLFW initialization failed.
2024-03-08 08:31:43 [1,646ms] [Warning] [carb] Failed to startup plugin carb.windowing-glfw.plugin (interfaces: [carb::windowing::IGLContext v1.0],[carb::windowing::IWindowing v1.4]) (impl: carb.windowing-glfw.plugin)
2024-03-08 08:31:43 [1,647ms] [Error] [carb.glinterop.plugin] GLInteropContext::init: carb::windowing is not available

|---------------------------------------------------------------------------------------------|
| Driver Version: 525.105.17 | Graphics API: Vulkan
|=============================================================================================|
| GPU | Name                             | Active | LDA | GPU Memory | Vendor-ID | LUID       |
|     |                                  |        |     |            | Device-ID | UUID       |
|     |                                  |        |     |            | Bus-ID    |            |
|---------------------------------------------------------------------------------------------|
| 0   | NVIDIA A100-SXM4-80GB            | Yes: 0 |     | 81920 MB   | 10de      | 0          |
|     |                                  |        |     |            | 20b2      | 6dc5bedc.. |
|     |                                  |        |     |            | 1         |            |
|---------------------------------------------------------------------------------------------|
| 1   | NVIDIA A100-SXM4-80GB            | Yes: 1 |     | 81920 MB   | 10de      | 0          |
|     |                                  |        |     |            | 20b2      | 2554dda3.. |
|     |                                  |        |     |            | 41        |            |
|---------------------------------------------------------------------------------------------|
| 2   | NVIDIA A100-SXM4-80GB            | Yes: 2 |     | 81920 MB   | 10de      | 0          |
|     |                                  |        |     |            | 20b2      | 729b9ffe.. |
|     |                                  |        |     |            | 81        |            |
|---------------------------------------------------------------------------------------------|
| 3   | NVIDIA A100-SXM4-80GB            | Yes: 3 |     | 81920 MB   | 10de      | 0          |
|     |                                  |        |     |            | 20b2      | afeed08a.. |
|     |                                  |        |     |            | c1        |            |
|=============================================================================================|
| OS: 22.04.3 LTS (Jammy Jellyfish) ubuntu, Version: 22.04.3, Kernel: 5.15.0-97-generic
| Processor: AMD EPYC 7443 24-Core Processor | Cores: 48 | Logical: 96
|---------------------------------------------------------------------------------------------|
| Total Memory (MB): 1031911 | Free Memory: 1019824
| Total Page/Swap (MB): 8191 | Free Page/Swap: 8191
|---------------------------------------------------------------------------------------------|
2024-03-08 08:31:46 [4,949ms] [Warning] [gpu.foundation.plugin] ECC is enabled for device 0. This will reduce rendering performance.
2024-03-08 08:31:46 [4,958ms] [Warning] [gpu.foundation.plugin] ECC is enabled for device 1. This will reduce rendering performance.
2024-03-08 08:31:46 [4,968ms] [Warning] [gpu.foundation.plugin] ECC is enabled for device 2. This will reduce rendering performance.
2024-03-08 08:31:46 [4,979ms] [Warning] [gpu.foundation.plugin] ECC is enabled for device 3. This will reduce rendering performance.
2024-03-08 08:31:46 [4,979ms] [Warning] [gpu.foundation.plugin] IOMMU is enabled. Found 132 items in /sys/kernel/iommu_groups/.
2024-03-08 08:31:46 [4,979ms] [Warning] [gpu.foundation.plugin] An input-output memory management unit (IOMMU) appears to be enabled on this system. On bare-metal Linux systems, CUDA and the display driver do not support IOMMU-enabled PCIe peer to peer memory copy. If you are on a bare-metal Linux system, please disable the IOMMU. Otherwise you risk image corruption and program instability. This typically can be controlled via BIOS settings (Intel Virtualization Technology for Directed I/O (VT-d) or AMD I/O Virtualization Technology (AMD-Vi)) and kernel parameters (iommu, intel_iommu, amd_iommu). Note that in virtual machines with GPU pass-through (vGPU) the IOMMU needs to be enabled. Since we can not reliably detect whether this system is bare-metal or a virtual machine, we show this warning in any case when an IOMMU appears to be enabled.
2024-03-08 08:31:46 [4,979ms] [Warning] [gpu.foundation.plugin]
2024-03-08 08:31:46 [4,979ms] [Warning] [gpu.foundation.plugin] -----------------------------------------------------------------------
2024-03-08 08:31:46 [4,979ms] [Warning] [gpu.foundation.plugin] !!!!! Local system validation failed! Incorrect configuration detected.
2024-03-08 08:31:46 [4,979ms] [Warning] [gpu.foundation.plugin] Summary below. Details above.
2024-03-08 08:31:46 [4,979ms] [Warning] [gpu.foundation.plugin] -----------------------------------------------------------------------
2024-03-08 08:31:46 [4,979ms] [Warning] [gpu.foundation.plugin]
2024-03-08 08:31:46 [4,979ms] [Warning] [gpu.foundation.plugin] - ECC: FAILED
2024-03-08 08:31:46 [4,979ms] [Warning] [gpu.foundation.plugin] - IOMMU: FAILED
2024-03-08 08:31:46 [4,979ms] [Warning] [gpu.foundation.plugin]
2024-03-08 08:31:46 [4,980ms] [Warning] [gpu.foundation.plugin] -----------------------------------------------------------------------
[5.908s] [ext: omni.kit.pipapi-0.0.0] startup
[5.912s] [ext: omni.kit.pip_archive-0.0.0] startup
[5.912s] [ext: omni.pip.compute-1.2.0] startup
[5.913s] [ext: omni.pip.torch-2_0_1-2.0.2] startup
[6.007s] [ext: omni.pip.cloud-1.0.1] startup
[6.017s] [ext: omni.isaac.core_archive-2.2.1] startup
[6.017s] [ext: omni.isaac.ml_archive-1.1.3] startup
[6.017s] [ext: omni.kit.telemetry-0.5.0] startup
[6.058s] [ext: omni.mtlx-0.1.0] startup
[6.059s] [ext: omni.usd.config-1.0.3] startup
[6.065s] [ext: omni.gpucompute.plugins-0.0.0] startup
[6.065s] [ext: omni.usd.libs-1.0.0] startup
[6.166s] [ext: omni.kit.loop-isaac-1.1.0] startup
[6.166s] [ext: omni.kit.test-0.0.0] startup
[6.167s] [ext: omni.usd.schema.physics-0.0.0] startup
2024-03-08 08:31:47 [6,160ms] [Error] [omni.kit.app._impl] [py stderr]: /isaac-sim/kit/exts/omni.usd.libs/pxr/Tf/__DOC.py:290: DeprecationWarning: invalid escape sequence '\ '
  result["MallocTag"].GetCallTree.func_doc = """**classmethod** GetCallTree(tree, skipRepeated) -> bool
/isaac-sim/kit/exts/omni.usd.libs/pxr/Tf/__DOC.py:290: DeprecationWarning: invalid escape sequence '\ '
  result["MallocTag"].GetCallTree.func_doc = """**classmethod** GetCallTree(tree, skipRepeated) -> bool
2024-03-08 08:31:48 [6,241ms] [Error] [omni.kit.app._impl] [py stderr]: /isaac-sim/kit/exts/omni.usd.libs/pxr/Sdf/__DOC.py:11: DeprecationWarning: invalid escape sequence '\p'
  result["AssetPath"].__init__.func_doc = """__init__()
/isaac-sim/kit/exts/omni.usd.libs/pxr/Sdf/__DOC.py:11: DeprecationWarning: invalid escape sequence '\p'
  result["AssetPath"].__init__.func_doc = """__init__()
[6.317s] [ext: omni.usd.schema.physx-0.0.0] startup
[6.344s] [ext: omni.usd.schema.anim-0.0.0] startup
2024-03-08 08:31:48 [6,339ms] [Error] [omni.kit.app._impl] [py stderr]: /isaac-sim/kit/exts/omni.usd.libs/pxr/UsdSkel/__DOC.py:500: DeprecationWarning: invalid escape sequence '\e'
  result["AnimMapper"].Remap.func_doc = """Remap(source, target, elementSize, defaultValue) -> bool
/isaac-sim/kit/exts/omni.usd.libs/pxr/UsdSkel/__DOC.py:500: DeprecationWarning: invalid escape sequence '\e'
  result["AnimMapper"].Remap.func_doc = """Remap(source, target, elementSize, defaultValue) -> bool
[6.368s] [ext: omni.usd.schema.audio-0.0.0] startup
[6.371s] [ext: omni.usd.schema.geospatial-0.0.0] startup
[6.374s] [ext: omni.usd.schema.semantics-0.0.0] startup
[6.377s] [ext: omni.usd.schema.omniscripting-1.0.0] startup
[6.382s] [ext: omni.usd.schema.omnigraph-1.0.0] startup
[6.387s] [ext: omni.kvdb-0.0.0] startup
[6.389s] [ext: omni.usd_resolver-1.0.1] startup
[6.394s] [ext: omni.localcache-0.0.0] startup
[6.395s] [ext: omni.usd.core-1.1.8] startup
[6.399s] [ext: omni.physx.foundation-105.1.12-5.1] startup
[6.399s] [ext: usdrt.scenegraph-7.2.34] startup
[6.438s] [ext: omni.hydra.scene_delegate-0.3.2] startup
[6.444s] [ext: omni.resourcemonitor-105.0.0] startup
[6.446s] [ext: omni.activity.core-1.0.1] startup
[6.447s] [ext: omni.appwindow-1.1.5] startup
2024-03-08 08:31:48 [6,434ms] [Warning] [carb.windowing-glfw.plugin] GLFW initialization failed.
2024-03-08 08:31:48 [6,434ms] [Warning] [carb] Failed to startup plugin carb.windowing-glfw.plugin (interfaces: [carb::windowing::IGLContext v1.0],[carb::windowing::IWindowing v1.4]) (impl: carb.windowing-glfw.plugin)
[6.452s] [ext: omni.hydra.usdrt_delegate-7.2.34] startup
[6.486s] [ext: omni.kit.renderer.core-0.0.0] startup
2024-03-08 08:31:48 [6,476ms] [Warning] [carb.windowing-glfw.plugin] GLFW initialization failed.
2024-03-08 08:31:48 [6,476ms] [Warning] [carb] Failed to startup plugin carb.windowing-glfw.plugin (interfaces: [carb::windowing::IGLContext v1.0],[carb::windowing::IWindowing v1.4]) (impl: carb.windowing-glfw.plugin)
2024-03-08 08:31:48 [6,538ms] [Warning] [carb.windowing-glfw.plugin] GLFW initialization failed.
2024-03-08 08:31:48 [6,538ms] [Warning] [carb] Failed to startup plugin carb.windowing-glfw.plugin (interfaces: [carb::windowing::IGLContext v1.0],[carb::windowing::IWindowing v1.4]) (impl: carb.windowing-glfw.plugin)
[6.556s] [ext: omni.usdphysics-105.1.12-5.1] startup
[6.559s] [ext: omni.kit.renderer.capture-0.0.0] startup
[6.561s] [ext: omni.kit.numpy.common-0.1.2] startup
[6.563s] [ext: omni.kit.renderer.imgui-0.0.0] startup
2024-03-08 08:31:48 [6,558ms] [Warning] [carb.windowing-glfw.plugin] GLFW initialization failed.
2024-03-08 08:31:48 [6,558ms] [Warning] [carb] Failed to startup plugin carb.windowing-glfw.plugin (interfaces: [carb::windowing::IGLContext v1.0],[carb::windowing::IWindowing v1.4]) (impl: carb.windowing-glfw.plugin)
2024-03-08 08:31:48 [6,567ms] [Warning] [carb.windowing-glfw.plugin] GLFW initialization failed.
2024-03-08 08:31:48 [6,567ms] [Warning] [carb] Failed to startup plugin carb.windowing-glfw.plugin (interfaces: [carb::windowing::IGLContext v1.0],[carb::windowing::IWindowing v1.4]) (impl: carb.windowing-glfw.plugin)
2024-03-08 08:31:48 [6,572ms] [Warning] [carb.windowing-glfw.plugin] GLFW initialization failed.
2024-03-08 08:31:48 [6,572ms] [Warning] [carb] Failed to startup plugin carb.windowing-glfw.plugin (interfaces: [carb::windowing::IGLContext v1.0],[carb::windowing::IWindowing v1.4]) (impl: carb.windowing-glfw.plugin)
[6.660s] [ext: omni.kit.actions.core-1.0.0] startup
[6.662s] [ext: omni.ui-2.18.5] startup
[6.678s] [ext: omni.graph.exec-0.3.0] startup
[6.679s] [ext: omni.kit.commands-1.4.6] startup
[6.689s] [ext: omni.kit.window.popup_dialog-2.0.23] startup
[6.698s] [ext: carb.audio-0.1.0] startup
[6.723s] [ext: omni.kit.exec.core-0.5.0] startup
[6.725s] [ext: omni.timeline-1.0.9] startup
[6.727s] [ext: omni.kit.widget.nucleus_connector-1.1.4] startup
[6.734s] [ext: omni.kit.audiodeviceenum-1.0.1] startup
[6.736s] [ext: omni.convexdecomposition-105.1.12-5.1] startup
[6.740s] [ext: omni.usd-1.10.18] startup
2024-03-08 08:31:48 [6,753ms] [Error] [omni.kit.app._impl] [py stderr]: /isaac-sim/kit/exts/omni.usd/omni/usd/_impl/utils.py:1003: DeprecationWarning: invalid escape sequence '\d'
  match = re.search("_(\d+)$", path)
/isaac-sim/kit/exts/omni.usd/omni/usd/_impl/utils.py:1003: DeprecationWarning: invalid escape sequence '\d'
  match = re.search("_(\d+)$", path)
[6.832s] [ext: omni.physx.cooking-105.1.12-5.1] startup
[6.839s] [ext: omni.iray.libs-0.0.0] startup
[6.842s] [ext: omni.physx-105.1.12-5.1] startup
2024-03-08 08:31:48 [6,842ms] [Warning] [omni.stageupdate.plugin] Deprecated: direct use of IStageUpdate callbacks is deprecated. Use IStageUpdate::getStageUpdate instead.
[6.858s] [ext: omni.mdl.neuraylib-0.2.0] startup
[6.860s] [ext: omni.isaac.dynamic_control-1.2.6] startup
[6.879s] [ext: omni.kit.widget.versioning-1.4.6] startup
[6.888s] [ext: omni.mdl-0.2.1] startup
[7.015s] [ext: omni.kit.helper.file_utils-0.1.6] startup
[7.078s] [ext: omni.kit.widget.nucleus_info-1.0.2] startup
[7.080s] [ext: omni.kit.widget.path_field-2.0.8] startup
[7.083s] [ext: omni.kit.widget.filebrowser-2.3.35] startup
2024-03-08 08:31:48 [7,071ms] [Error] [omni.kit.app._impl] [py stderr]: /isaac-sim/kit/exts/omni.kit.widget.filebrowser/omni/kit/widget/filebrowser/model.py:378: DeprecationWarning: invalid escape sequence '\.'
  compiled_regex = re.compile(r'([\w.-]+)(.)('+numbers+')\.('+types+')$')
/isaac-sim/kit/exts/omni.kit.widget.filebrowser/omni/kit/widget/filebrowser/model.py:378: DeprecationWarning: invalid escape sequence '\.'
  compiled_regex = re.compile(r'([\w.-]+)(.)('+numbers+')\.('+types+')$')
[7.106s] [ext: omni.kit.search_core-1.0.5] startup
[7.108s] [ext: omni.kit.widget.browser_bar-2.0.9] startup
[7.110s] [ext: omni.kit.widget.search_delegate-1.0.4] startup
[7.114s] [ext: omni.ui.scene-1.7.0] startup
[7.120s] [ext: omni.kit.notification_manager-1.0.6] startup
[7.125s] [ext: omni.kit.clipboard-1.0.3] startup
[7.125s] [ext: omni.isaac.kit-1.4.7] startup
[7.126s] [ext: omni.kit.window.filepicker-2.10.13] startup
[7.162s] [ext: omni.kit.usd.layers-2.1.27] startup
[7.189s] [ext: omni.physics.tensors-0.1.0] startup
[7.203s] [ext: omni.kit.window.file_importer-1.0.22] startup
[7.206s] [ext: omni.kit.menu.utils-1.5.6] startup
2024-03-08 08:31:48 [7,198ms] [Error] [omni.kit.app._impl] [py stderr]: /isaac-sim/kit/exts/omni.kit.renderer.imgui/omni/kit/ui/editor_menu.py:120: DeprecationWarning: invalid escape sequence '\/'
  menu_parts = menu_path.replace("\/", "@TEMPSLASH@").split("/")
/isaac-sim/kit/exts/omni.kit.renderer.imgui/omni/kit/ui/editor_menu.py:120: DeprecationWarning: invalid escape sequence '\/'
  menu_parts = menu_path.replace("\/", "@TEMPSLASH@").split("/")
[7.227s] [ext: omni.physx.tensors-0.1.0] startup
[7.239s] [ext: omni.warp.core-1.0.0-beta.2] startup
Warp 1.0.0-beta.2 initialized:
   CUDA Toolkit: 11.5, Driver: 12.0
   Devices:
     "cpu"    | x86_64
     "cuda:0" | NVIDIA A100-SXM4-80GB (sm_80)
     "cuda:1" | NVIDIA A100-SXM4-80GB (sm_80)
     "cuda:2" | NVIDIA A100-SXM4-80GB (sm_80)
     "cuda:3" | NVIDIA A100-SXM4-80GB (sm_80)
   Kernel cache: /scratch/fokow/cache/warp/1.0.0-beta.2
[7.425s] [ext: omni.kit.material.library-1.3.41] startup
[7.447s] [ext: omni.isaac.version-1.0.3] startup
[7.448s] [ext: omni.kit.window.title-1.1.3] startup
2024-03-08 08:31:49 [7,437ms] [Warning] [carb.windowing-glfw.plugin] GLFW initialization failed.
2024-03-08 08:31:49 [7,437ms] [Warning] [carb] Failed to startup plugin carb.windowing-glfw.plugin (interfaces: [carb::windowing::IGLContext v1.0],[carb::windowing::IWindowing v1.4]) (impl: carb.windowing-glfw.plugin)
2024-03-08 08:31:49 [7,442ms] [Warning] [carb.windowing-glfw.plugin] GLFW initialization failed.
2024-03-08 08:31:49 [7,442ms] [Warning] [carb] Failed to startup plugin carb.windowing-glfw.plugin (interfaces: [carb::windowing::IGLContext v1.0],[carb::windowing::IWindowing v1.4]) (impl: carb.windowing-glfw.plugin)
[7.459s] [ext: omni.isaac.core-3.3.2] startup
2024-03-08 08:31:49 [7,935ms] [Error] [omni.kit.app._impl] [py stderr]: /isaac-sim/exts/omni.isaac.core/omni/isaac/core/utils/deformable_mesh_utils.py:21: DeprecationWarning: invalid escape sequence '\d'
  flt_grp = "([-+]?\d*\.\d+|\d+)"
/isaac-sim/exts/omni.isaac.core/omni/isaac/core/utils/deformable_mesh_utils.py:21: DeprecationWarning: invalid escape sequence '\d'
  flt_grp = "([-+]?\d*\.\d+|\d+)"
2024-03-08 08:31:49 [7,935ms] [Error] [omni.kit.app._impl] [py stderr]: /isaac-sim/exts/omni.isaac.core/omni/isaac/core/utils/deformable_mesh_utils.py:22: DeprecationWarning: invalid escape sequence '\s'
  v_pat = re.compile(r"^\s*[vV]\s*" + flt_grp + "\s*" + flt_grp + "\s*" + flt_grp + "\s*$")
/isaac-sim/exts/omni.isaac.core/omni/isaac/core/utils/deformable_mesh_utils.py:22: DeprecationWarning: invalid escape sequence '\s'
  v_pat = re.compile(r"^\s*[vV]\s*" + flt_grp + "\s*" + flt_grp + "\s*" + flt_grp + "\s*$")
[7.973s] [ext: omni.isaac.cloner-0.7.2] startup
[7.976s] [ext: omni.physx.fabric-105.1.12-5.1] startup
[7.983s] [ext: omni.kit.mainwindow-1.0.1] startup
[7.985s] [ext: omni.isaac.gym-0.10.0] startup
[7.985s] [ext: omni.isaac.sim.python.gym.headless-2023.1.1] startup
[7.986s] Simulation App Starting
2024-03-08 08:31:49 [7,982ms] [Error] [omni.kit.app._impl] [py stderr]: /isaac-sim/kit/exts/omni.usd.libs/pxr/PxOsd/__DOC.py:270: DeprecationWarning: invalid escape sequence '\w'
  result["MeshTopologyValidation"].__doc__ = """
/isaac-sim/kit/exts/omni.usd.libs/pxr/PxOsd/__DOC.py:270: DeprecationWarning: invalid escape sequence '\w'
  result["MeshTopologyValidation"].__doc__ = """
2024-03-08 08:31:49 [8,023ms] [Error] [omni.physx.plugin] PhysX error: Could not load libcuda.so: libcuda.so: cannot open shared object file: No such file or directory , FILE /buildAgent/work/eb2f45c4acc808a0/physx/source/physx/src/gpu/PxPhysXGpuModuleLoader.cpp, LINE 200
[8.087s] app ready
2024-03-08 08:31:50 [8,292ms] [Warning] [rtx.neuraylib.plugin] [IRAY:RENDER] 1.1 IRAY rend warn : Your NVIDIA driver supports CUDA version up to 12.0, but libcuda.so cannot be found.
2024-03-08 08:31:50 [8,292ms] [Warning] [rtx.neuraylib.plugin] [IRAY:RENDER] 1.1 IRAY rend warn : GPU 1 (NVIDIA A100-SXM4-80GB) with CUDA compute capability 8.0 is unsupported by this version of iray photoreal.
2024-03-08 08:31:50 [8,292ms] [Warning] [rtx.neuraylib.plugin] [IRAY:RENDER] 1.1 IRAY rend warn : GPU 2 (NVIDIA A100-SXM4-80GB) with CUDA compute capability 8.0 is unsupported by this version of iray photoreal.
2024-03-08 08:31:50 [8,292ms] [Warning] [rtx.neuraylib.plugin] [IRAY:RENDER] 1.1 IRAY rend warn : GPU 3 (NVIDIA A100-SXM4-80GB) with CUDA compute capability 8.0 is unsupported by this version of iray photoreal.
2024-03-08 08:31:50 [8,292ms] [Warning] [rtx.neuraylib.plugin] [IRAY:RENDER] 1.1 IRAY rend warn : GPU 4 (NVIDIA A100-SXM4-80GB) with CUDA compute capability 8.0 is unsupported by this version of iray photoreal.
2024-03-08 08:31:50 [8,292ms] [Warning] [rtx.neuraylib.plugin] [IRAY:RENDER] 1.1 IRAY rend warn : There is no CUDA-capable GPU available to the iray photoreal renderer.
2024-03-08 08:31:50 [8,293ms] [Warning] [rtx.neuraylib.plugin] [IRT:RENDER] 1.1 IRT rend warn : Your NVIDIA driver API 'libcuda.so' cannot be found. All GPUs will be disabled.
[8.642s] Simulation App Startup Complete
task_name: Cartpole
experiment:
num_envs:
seed: 42
torch_deterministic: False
max_iterations:
physics_engine: physx
pipeline: gpu
sim_device: gpu
device_id: 0
rl_device: cuda:0
multi_gpu: False
num_threads: 4
solver_type: 1
test: False
checkpoint:
evaluation: False
headless: True
enable_livestream: False
mt_timeout: 300
enable_recording: False
recording_interval: 2000
recording_length: 100
recording_fps: 30
recording_dir:
wandb_activate: False
wandb_group:
wandb_name: Cartpole
wandb_entity:
wandb_project: omniisaacgymenvs
kit_app:
warp: False
task:
  name: Cartpole
  physics_engine: physx
  env:
    numEnvs: 512
    envSpacing: 4.0
    resetDist: 3.0
    maxEffort: 400.0
    clipObservations: 5.0
    clipActions: 1.0
    controlFrequencyInv: 2
  sim:
    dt: 0.0083
    use_gpu_pipeline: True
    gravity: [0.0, 0.0, -9.81]
    add_ground_plane: True
    add_distant_light: False
    use_fabric: True
    enable_scene_query_support: False
    disable_contact_processing: False
    enable_cameras: False
    default_physics_material:
      static_friction: 1.0
      dynamic_friction: 1.0
      restitution: 0.0
    physx:
      worker_thread_count: 4
      solver_type: 1
      use_gpu: True
      solver_position_iteration_count: 4
      solver_velocity_iteration_count: 0
      contact_offset: 0.02
      rest_offset: 0.001
      bounce_threshold_velocity: 0.2
      friction_offset_threshold: 0.04
      friction_correlation_distance: 0.025
      enable_sleeping: True
      enable_stabilization: True
      max_depenetration_velocity: 100.0
      gpu_max_rigid_contact_count: 524288
      gpu_max_rigid_patch_count: 81920
      gpu_found_lost_pairs_capacity: 1024
      gpu_found_lost_aggregate_pairs_capacity: 262144
      gpu_total_aggregate_pairs_capacity: 1024
      gpu_max_soft_body_contacts: 1048576
      gpu_max_particle_contacts: 1048576
      gpu_heap_capacity: 67108864
      gpu_temp_buffer_capacity: 16777216
      gpu_max_num_partitions: 8
    Cartpole:
      override_usd_defaults: False
      enable_self_collisions: False
      enable_gyroscopic_forces: True
      solver_position_iteration_count: 4
      solver_velocity_iteration_count: 0
      sleep_threshold: 0.005
      stabilization_threshold: 0.001
      density: -1
      max_depenetration_velocity: 100.0
      contact_offset: 0.02
      rest_offset: 0.001
train:
  params:
    seed: 42
    algo:
      name: a2c_continuous
    model:
      name: continuous_a2c_logstd
    network:
      name: actor_critic
      separate: False
      space:
        continuous:
          mu_activation: None
          sigma_activation: None
          mu_init:
            name: default
          sigma_init:
            name: const_initializer
            val: 0
          fixed_sigma: True
      mlp:
        units: [32, 32]
        activation: elu
        initializer:
          name: default
        regularizer:
          name: None
    load_checkpoint: False
    load_path:
    config:
      name: Cartpole
      full_experiment_name: Cartpole
      device: cuda:0
      device_name: cuda:0
      env_name: rlgpu
      multi_gpu: False
      ppo: True
      mixed_precision: False
      normalize_input: True
      normalize_value: True
      num_actors: 512
      reward_shaper:
        scale_value: 0.1
      normalize_advantage: True
      gamma: 0.99
      tau: 0.95
      learning_rate: 0.0003
      lr_schedule: adaptive
      kl_threshold: 0.008
      score_to_win: 20000
      max_epochs: 100
      save_best_after: 50
      save_frequency: 25
      grad_norm: 1.0
      entropy_coef: 0.0
      truncate_grads: True
      e_clip: 0.2
      horizon_length: 16
      minibatch_size: 8192
      mini_epochs: 8
      critic_coef: 4
      clip_value: True
      seq_length: 4
      bounds_loss_coef: 0.0001
Setting seed: 42
Sim params does not have attribute: physx
Sim params does not have attribute: Cartpole
Pipeline: GPU
Pipeline Device: cuda:0
Sim Device: GPU
Task Device: cuda:0
RL device: cuda:0
2024-03-08 08:31:51 [9,758ms] [Warning] [omni.isaac.core.utils.viewports] omni.kit.viewport.utility needs to be enabled before using this function
2024-03-08 08:32:11 [29,790ms] [Warning] [omni.client.python] Detected a blocking function. This will cause hitches or hangs in the UI. Please switch to the async version:
  File "/my_dir/OmniIsaacGymEnvs/omniisaacgymenvs/scripts/rlgames_train.py", line 174, in
  File "/isaac-sim/kit/python/lib/python3.10/site-packages/hydra/main.py", line 94, in decorated_main
  File "/isaac-sim/kit/python/lib/python3.10/site-packages/hydra/_internal/utils.py", line 394, in _run_hydra
  File "/isaac-sim/kit/python/lib/python3.10/site-packages/hydra/_internal/utils.py", line 457, in _run_app
  File "/isaac-sim/kit/python/lib/python3.10/site-packages/hydra/_internal/utils.py", line 220, in run_and_report
  File "/isaac-sim/kit/python/lib/python3.10/site-packages/hydra/_internal/utils.py", line 458, in
  File "/isaac-sim/kit/python/lib/python3.10/site-packages/hydra/_internal/hydra.py", line 119, in run
  File "/isaac-sim/kit/python/lib/python3.10/site-packages/hydra/core/utils.py", line 186, in run_job
  File "/my_dir/OmniIsaacGymEnvs/omniisaacgymenvs/scripts/rlgames_train.py", line 145, in parse_hydra_configs
  File "/files/omniisaacgymenvs/omniisaacgymenvs/utils/task_util.py", line 103, in initialize_task
  File "/files/omniisaacgymenvs/omniisaacgymenvs/envs/vec_env_rlgames.py", line 47, in set_task
  File "/isaac-sim/exts/omni.isaac.gym/omni/isaac/gym/vec_env/vec_env_base.py", line 129, in set_task
  File "/isaac-sim/exts/omni.isaac.core/omni/isaac/core/world/world.py", line 401, in reset
  File "/files/omniisaacgymenvs/omniisaacgymenvs/tasks/cartpole.py", line 65, in set_up_scene
  File "/files/omniisaacgymenvs/omniisaacgymenvs/tasks/cartpole.py", line 83, in get_cartpole
  File "/files/omniisaacgymenvs/omniisaacgymenvs/robots/articulations/cartpole.py", line 54, in __init__
  File "/isaac-sim/exts/omni.isaac.core/omni/isaac/core/utils/nucleus.py", line 572, in get_assets_root_path
  File "/isaac-sim/exts/omni.isaac.core/omni/isaac/core/utils/nucleus.py", line 192, in check_server
  File "/isaac-sim/kit/extscore/omni.client/omni/client/__init__.py", line 158, in stat
2024-03-08 08:34:07 [145,334ms] [Warning] [omni.physx.plugin] PhysX warning: GPU solver pipeline failed, switching to software, FILE /buildAgent/work/eb2f45c4acc808a0/physx/source/simulationcontroller/src/ScScene.cpp, LINE 819
2024-03-08 08:34:07 [145,334ms] [Warning] [omni.physx.plugin] PhysX warning: GPU Bp pipeline failed, switching to software, FILE /buildAgent/work/eb2f45c4acc808a0/physx/source/simulationcontroller/src/ScScene.cpp, LINE 827
/isaac-sim/kit/python/bin/python3: symbol lookup error: /isaac-sim/extsPhysics/omni.physx-105.1.12-5.1/bin/libomni.physx.plugin.so: undefined symbol: cuDeviceGetCount
There was an error running python
```


Does this mean that the application was not linked correctly against the CUDA library? I'm happy to provide any further details that are needed.

Could you please suggest possible next steps to debug this?
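
One check that might narrow this down, as a minimal sketch rather than a confirmed diagnosis: the log says `libcuda.so` cannot be opened, yet Warp still reports a driver version in the same run, so it is possible (an assumption, not verified here) that only a versioned name such as `libcuda.so.1` is visible inside the container. The snippet below uses Python's standard `ctypes` module to test which names the dynamic loader can open and whether `cuDeviceGetCount` resolves in them. The helper name `probe` and the candidate list are illustrative and not part of Isaac Sim.

```python
import ctypes

# Candidate library names to try. "libcuda.so" is the name PhysX reports in
# the log; "libcuda.so.1" is an assumed alternative worth checking.
CANDIDATES = ("libcuda.so", "libcuda.so.1")
SYMBOL = "cuDeviceGetCount"


def probe(names=CANDIDATES, symbol=SYMBOL):
    """Return {library name: status string} for each candidate name."""
    results = {}
    for name in names:
        try:
            lib = ctypes.CDLL(name)  # asks the dynamic loader to open the library
        except OSError as exc:
            results[name] = f"not loadable: {exc}"
            continue
        # ctypes raises AttributeError for a missing symbol, so hasattr works here.
        if hasattr(lib, symbol):
            results[name] = "loaded, symbol found"
        else:
            results[name] = "loaded, symbol missing"
    return results


if __name__ == "__main__":
    for name, status in probe().items():
        print(f"{name}: {status}")
```

Running this with the same interpreter that fails (`/isaac-sim/kit/python/bin/python3`, inside the container) should show whether the loader can see the driver library at all. If only the versioned name loads, the missing unversioned name would be a plausible place to look next.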

VladimirFokow commented 1 month ago

Is this because Isaac Sim isn't supported on A100 GPUs at all (even in headless mode)? See: https://forums.developer.nvidia.com/t/isaac-sim-orbit-docker-on-hpc/269377