NVlabs / curobo

CUDA Accelerated Robot Library
https://curobo.org

Error when running isaac sim docker #119

Closed BolunDai0216 closed 3 months ago

BolunDai0216 commented 10 months ago
  1. cuRobo installation mode: docker isaac sim
  2. python version: 3.10
  3. Isaac Sim version (if using): 2023.1.0

Issue Details

After following the installation steps for the cuRobo Isaac Sim docker, I started the container and ran the simple stacking example

omni_python simple_stacking.py

however, I get the error

MESA: warning: Driver does not support the 0xa788 PCI ID.
MESA: warning: Driver does not support the 0xa788 PCI ID.
libGL error: failed to create dri screen
libGL error: failed to load driver: iris
MESA: warning: Driver does not support the 0xa788 PCI ID.
libGL error: failed to create dri screen
libGL error: failed to load driver: iris
X Error of failed request:  GLXBadFBConfig
  Major opcode of failed request:  152 (GLX)
  Minor opcode of failed request:  0 ()
  Serial number of failed request:  176
  Current serial number in output stream:  176

Any idea what might be the issue?

Thanks in advance!
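For anyone triaging a similar log, here is a minimal sketch that pulls out the two signals in the output above: which Mesa driver failed to load, and which X error was raised. The function name `summarize_gl_failure` is hypothetical, and the patterns are taken only from the lines quoted in this issue. Since `iris` is Mesa's driver for Intel GPUs, a result that lists it suggests GLX is being resolved through Mesa rather than through the NVIDIA driver, though that is an interpretation and not something the log states directly.

```python
import re

# Patterns copied from the log lines quoted in this issue.
MESA_DRIVER_RE = re.compile(r"libGL error: failed to load driver: (\S+)")
X_ERROR_RE = re.compile(r"X Error of failed request:\s+(\S+)")


def summarize_gl_failure(log_text):
    """Return the Mesa drivers that failed to load and the X error names seen."""
    return {
        "failed_mesa_drivers": sorted(set(MESA_DRIVER_RE.findall(log_text))),
        "x_errors": sorted(set(X_ERROR_RE.findall(log_text))),
    }


if __name__ == "__main__":
    sample = (
        "MESA: warning: Driver does not support the 0xa788 PCI ID.\n"
        "libGL error: failed to create dri screen\n"
        "libGL error: failed to load driver: iris\n"
        "X Error of failed request:  GLXBadFBConfig\n"
    )
    print(summarize_gl_failure(sample))
```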

balakumar-s commented 10 months ago

Does isaac sim work through the omniverse launcher?

BolunDai0216 commented 10 months ago

Isaac Sim does work on my host machine, and I can stream Isaac Sim via WebRTC using the Isaac Sim docker file: docker pull nvcr.io/nvidia/isaac-sim:2023.1.1.

balakumar-s commented 10 months ago

Are you able to run isaac sim docker with gui, not through headless?

BolunDai0216 commented 10 months ago

I assume running with the GUI means not passing the --headless_mode argument. If that is the case, then no.

balakumar-s commented 10 months ago

How did you start the docker?

BolunDai0216 commented 10 months ago

I was able to start it by using this command

docker run --name isaac-sim --entrypoint bash -it --gpus all -e "ACCEPT_EULA=Y" --rm --network=host \
    -e "PRIVACY_CONSENT=Y" \
    -v ~/docker/isaac-sim/cache/kit:/isaac-sim/kit/cache:rw \
    -v ~/docker/isaac-sim/cache/ov:/root/.cache/ov:rw \
    -v ~/docker/isaac-sim/cache/pip:/root/.cache/pip:rw \
    -v ~/docker/isaac-sim/cache/glcache:/root/.cache/nvidia/GLCache:rw \
    -v ~/docker/isaac-sim/cache/computecache:/root/.nv/ComputeCache:rw \
    -v ~/docker/isaac-sim/logs:/root/.nvidia-omniverse/logs:rw \
    -v ~/docker/isaac-sim/data:/root/.local/share/ov/data:rw \
    -v ~/docker/isaac-sim/documents:/root/Documents:rw \
    curobo_docker:isaac_sim_2023.1.0

that I copied from here. But I am still not able to run it using any of the provided shell scripts.
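The command above can also be assembled programmatically, which makes the mount list easier to compare against what a launch script does. This is only a sketch: `build_docker_run` and `CACHE_MOUNTS` are hypothetical names, and the flags and paths are copied from the command above, not from any cuRobo script. Note that the command sets no DISPLAY variable and mounts no X11 socket, which is consistent with it working only in headless mode.

```python
import os

# Host subdirectory -> container path, copied from the docker run command above.
CACHE_MOUNTS = {
    "cache/kit": "/isaac-sim/kit/cache",
    "cache/ov": "/root/.cache/ov",
    "cache/pip": "/root/.cache/pip",
    "cache/glcache": "/root/.cache/nvidia/GLCache",
    "cache/computecache": "/root/.nv/ComputeCache",
    "logs": "/root/.nvidia-omniverse/logs",
    "data": "/root/.local/share/ov/data",
    "documents": "/root/Documents",
}


def build_docker_run(image, host_root=None):
    """Return the docker run argument list with one read-write mount per entry."""
    if host_root is None:
        # "~" is not expanded when arguments are passed without a shell.
        host_root = os.path.expanduser("~/docker/isaac-sim")
    cmd = [
        "docker", "run", "--name", "isaac-sim", "--entrypoint", "bash", "-it",
        "--gpus", "all", "-e", "ACCEPT_EULA=Y", "--rm", "--network=host",
        "-e", "PRIVACY_CONSENT=Y",
    ]
    for host_sub, container_path in CACHE_MOUNTS.items():
        cmd += ["-v", f"{host_root}/{host_sub}:{container_path}:rw"]
    cmd.append(image)
    return cmd


if __name__ == "__main__":
    print(" ".join(build_docker_run("curobo_docker:isaac_sim_2023.1.0")))
```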

BolunDai0216 commented 10 months ago

@balakumar-s Sorry, I misunderstood your question. For the error I am referring to in this issue, I started the docker using the command

bash start_dev_docker.sh isaac_sim_2023.1.0

However, I was able to run things in headless mode if I use the command provided in the above comment.

balakumar-s commented 10 months ago

To use dev docker, you need to build a dev docker with build_dev_docker.sh

I think you didn't build a dev docker, so try start_docker.sh isaac_sim_2023.1.0

BolunDai0216 commented 10 months ago

I did build the dev docker, I will try the command you sent.

BolunDai0216 commented 10 months ago

The command you sent does not have this driver issue, but somehow the websocket is not working.

Here is the output of rebuilding the dev docker

bash build_dev_docker.sh isaac_sim_2023.1.0
isaac_sim_2023.1.0
1000
[+] Building 0.2s (26/26) FINISHED                                          docker:default
 => [internal] load build definition from user_isaac_sim.dockerfile                   0.0s
 => => transferring dockerfile: 2.84kB                                                0.0s
 => [internal] load .dockerignore                                                     0.0s
 => => transferring context: 60B                                                      0.0s
 => [internal] load metadata for docker.io/library/curobo_docker:isaac_sim_2023.1.0   0.0s
 => [ 1/22] FROM docker.io/library/curobo_docker:isaac_sim_2023.1.0                   0.0s
 => CACHED [ 2/22] RUN useradd -l -u 1000 -g users bolun                              0.0s
 => CACHED [ 3/22] RUN /sbin/adduser bolun sudo                                       0.0s
 => CACHED [ 4/22] RUN echo '%sudo ALL=(ALL) NOPASSWD:ALL' >> /etc/sudoers            0.0s
 => CACHED [ 5/22] RUN usermod -aG root bolun                                         0.0s
 => CACHED [ 6/22] RUN mkdir /isaac-sim/kit/cache && chown -R bolun:users /isaac-sim  0.0s
 => CACHED [ 7/22] RUN chown bolun:users /root && chown bolun:users /isaac-sim        0.0s
 => CACHED [ 8/22] RUN mkdir /root/.nv && chown -R bolun:users /root/.nv              0.0s
 => CACHED [ 9/22] RUN chown -R bolun:users /root/.cache                              0.0s
 => CACHED [10/22] RUN mkdir -p /isaac-sim/kit/logs/Kit/Isaac-Sim && chown -R bolun:  0.0s
 => CACHED [11/22] RUN mkdir /root/.nvidia-omniverse/logs && mkdir -p /home/bolun/.n  0.0s
 => CACHED [12/22] RUN chown -R bolun:users /isaac-sim/exts/omni.isaac.synthetic_rec  0.0s
 => CACHED [13/22] RUN chown -R bolun:users /isaac-sim/kit/exts/omni.gpu_foundation   0.0s
 => CACHED [14/22] RUN mkdir -p /home/bolun/.cache && cp -r /root/.cache/* /home/bol  0.0s
 => CACHED [15/22] RUN mkdir -p /isaac-sim/kit/data/documents/Kit && mkdir -p /isaac  0.0s
 => CACHED [16/22] RUN mkdir -p /home/bolun/.local                                    0.0s
 => CACHED [17/22] RUN echo "alias omni_python='/isaac-sim/python.sh'" >> /home/bolu  0.0s
 => CACHED [18/22] RUN echo "alias python='/isaac-sim/python.sh'" >> /home/bolun/.ba  0.0s
 => CACHED [19/22] RUN chown -R bolun:users /home/bolun                               0.0s
 => CACHED [20/22] WORKDIR /home/bolun                                                0.0s
 => CACHED [21/22] RUN mkdir /root/Documents && chown -R bolun:users /root/Documents  0.0s
 => CACHED [22/22] RUN echo 'completed'                                               0.0s
 => exporting to image                                                                0.0s
 => => exporting layers                                                               0.0s
 => => writing image sha256:e937a47958f62bea40ecb30080ad7b31016bac2a0141a2ee4d1df919  0.0s
 => => naming to docker.io/library/curobo_docker:user_isaac_sim_2023.1.0              0.0s
❯ bash start_dev_docker.sh isaac_sim_2023.1.0
Isaac Sim Dev Docker is not supported
To run a command as administrator (user "root"), use "sudo <command>".
See "man sudo_root" for details.

balakumar-s commented 10 months ago

Let's try to get the non dev docker working first.

  1. Run xhost + in a terminal outside docker.
  2. Start a docker with start_docker.sh isaac_sim_2023.1.0
  3. And then inside the docker, run ./runapp.sh

BolunDai0216 commented 10 months ago

I'll try that later tonight. Thanks for the help!

BolunDai0216 commented 10 months ago

2. start_docker.sh isaac_sim_2023.1.0

@balakumar-s I get the following output

./runapp.sh 

The NVIDIA Omniverse License Agreement (EULA) must be accepted before
Omniverse Kit can start. The license terms for this product can be viewed at
https://developer.nvidia.com/omniverse/license

Omniverse Software collects installation and configuration details about your software, hardware, and network
configuration (e.g., version of operating system, applications installed, type of hardware, network speed, IP
address) based on our legitimate interest in improving your experience. To improve performance, troubleshooting
and diagnostic purposes of our software, we also collect session behavior, error and crash logs.

Data Collection in container mode is completely anonymous unless specified. You may opt-out of this collection
anytime by not setting the PRIVACY_CONSENT environment variable.

To opt-in set the PRIVACY_CONSENT environment variable when running the container. Set the 
PRIVACY_USERID environment variable tag the telemetry data with a user ID or email.
Loading user config located at: '/root/.local/share/ov/data/Kit/Isaac-Sim/2023.1/user.config.json'
[Info] [carb] Logging to file: /root/.nvidia-omniverse/logs/Kit/Isaac-Sim/2023.1/kit_20240120_180706.log
2024-01-21 02:07:06 [0ms] [Warning] [carb.crashreporter-breakpad.plugin] [previous crash] preventing upload of minidump due to user opt-out: '/root/.local/share/ov/data/Kit/Isaac-Sim/2023.1/84d47d30-7e6d-4ce2-f5a52d92-6b1b012e.dmp.zip'
2024-01-21 02:07:06 [2ms] [Warning] [omni.ext.plugin] [ext: omni.kit.converter.cad_core-200.0.0-rc.3+105.0.lx64.r.cp310] Built using kit version: 105.0. Current version: 105.1. It is considered compatible, but building with a newer version is recommended.
2024-01-21 02:07:06 [2ms] [Warning] [omni.ext.plugin] [ext: omni.kit.converter.cad-200.0.0-rc.4+105.0] Built using kit version: 105.0. Current version: 105.1. It is considered compatible, but building with a newer version is recommended.
2024-01-21 02:07:06 [3ms] [Warning] [omni.ext.plugin] [ext: omni.kit.sequencer.core-103.4.1+105.0] Built using kit version: 105.0. Current version: 105.1. It is considered compatible, but building with a newer version is recommended.
2024-01-21 02:07:06 [3ms] [Warning] [omni.ext.plugin] [ext: omni.kit.sequencer.usd-103.4.2+105.0] Built using kit version: 105.0. Current version: 105.1. It is considered compatible, but building with a newer version is recommended.
2024-01-21 02:07:06 [3ms] [Warning] [omni.ext.plugin] [ext: omni.kit.widget.timeline-105.0.1+105.0] Built using kit version: 105.0. Current version: 105.1. It is considered compatible, but building with a newer version is recommended.
2024-01-21 02:07:06 [3ms] [Warning] [omni.ext.plugin] [ext: omni.kit.window.sequencer-103.4.1+105.0] Built using kit version: 105.0. Current version: 105.1. It is considered compatible, but building with a newer version is recommended.
2024-01-21 02:07:06 [3ms] [Warning] [omni.ext.plugin] [ext: omni.paint.brush.attributes-1.3.1+105.0] Built using kit version: 105.0. Current version: 105.1. It is considered compatible, but building with a newer version is recommended.
2024-01-21 02:07:06 [4ms] [Warning] [omni.ext.plugin] [ext: omni.usd.schema.sequence-2.3.0+105.0.lx64.r.cp310] Built using kit version: 105.0. Current version: 105.1. It is considered compatible, but building with a newer version is recommended.
[0.107s] [ext: omni.kit.async_engine-0.0.0] startup
[0.371s] [ext: omni.assets.plugins-0.0.0] startup
[0.372s] [ext: omni.datastore-0.0.0] startup
[0.372s] [ext: omni.client-1.0.1] startup
[0.379s] [ext: omni.taskagent-0.0.0] startup
[0.379s] [ext: omni.stats-0.0.0] startup
[0.380s] [ext: omni.activity.core-1.0.1] startup
[0.381s] [ext: omni.ujitso-0.0.0] startup
[0.382s] [ext: omni.hsscclient-0.0.0] startup
[0.382s] [ext: omni.activity.profiler-1.0.2] startup
[0.385s] [ext: omni.gpu_foundation-0.0.0] startup
[0.391s] [ext: omni.rtx.shadercache.vulkan-1.0.0] startup
[0.392s] [ext: carb.windowing.plugins-1.0.0] startup
2024-01-21 02:07:07 [841ms] [Warning] [carb.windowing-glfw.gamepad] Joystick with unknown remapping detected (will be ignored):  ELAN06FA:00 04F3:327E Touchpad [18000000f30400007e32000000010000]
2024-01-21 02:07:07 [841ms] [Warning] [carb.windowing-glfw.gamepad] Joystick with unknown remapping detected (will be ignored):  MX MCHNCL M Keyboard [050000006d04000067b3000010000000]
2024-01-21 02:07:07 [841ms] [Warning] [carb.windowing-glfw.gamepad] Joystick with unknown remapping detected (will be ignored):  ITE Tech. Inc. ITE Device(8258) Keyboard [030000008d04000088c9000010010000]
[0.863s] [ext: omni.kit.renderer.init-0.0.0] startup
MESA: warning: Driver does not support the 0xa788 PCI ID.
MESA: warning: Driver does not support the 0xa788 PCI ID.
libGL error: failed to create dri screen
libGL error: failed to load driver: iris
MESA: warning: Driver does not support the 0xa788 PCI ID.
libGL error: failed to create dri screen
libGL error: failed to load driver: iris
X Error of failed request:  GLXBadFBConfig
  Major opcode of failed request:  152 (GLX)
  Minor opcode of failed request:  0 ()
  Serial number of failed request:  211
  Current serial number in output stream:  211
2024-01-21 02:07:07 [929ms] [Warning] [carb] [Plugin: carb.taskagent.plugin] Module /isaac-sim/kit/exts/omni.taskagent/bin/deps/libcarb.taskagent.plugin.so remained loaded after unload request
2024-01-21 02:07:07 [930ms] [Warning] [omni.core.ITypeFactory] Module /isaac-sim/kit/exts/omni.activity.core/bin/libomni.activity.core.plugin.so remained loaded after unload request.

I see the same error.

balakumar-s commented 10 months ago

I'm not sure what's happening. You can try headless for the examples by adding --headless_mode native as an argument.
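As a rough illustration of how a flag like this typically reaches the simulator, the sketch below parses --headless_mode and turns it into a config dictionary with a "headless" key. This is an assumption about the general pattern, not a copy of the cuRobo example scripts; `parse_args` and `simulation_config` are hypothetical names, and the dictionary is only printed here instead of being handed to Isaac Sim.

```python
import argparse


def parse_args(argv=None):
    """Parse the --headless_mode option used in this thread (e.g. 'native')."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--headless_mode",
        type=str,
        default=None,
        help="Run without a local window, e.g. 'native'.",
    )
    return parser.parse_args(argv)


def simulation_config(args):
    """Build a launch config; headless is enabled whenever a mode is given."""
    # Assumption: the launcher reads a boolean "headless" entry from this dict.
    return {"headless": args.headless_mode is not None}


if __name__ == "__main__":
    print(simulation_config(parse_args(["--headless_mode", "native"])))
```

Even with headless enabled, the renderer still has to initialize a graphics context, so a driver-level failure can surface in both modes.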

BolunDai0216 commented 10 months ago

@balakumar-s when running

/isaac-sim/python.sh simple_stacking.py --headless_mode native

I get a similar error

Starting kit application with the following args:  ['/isaac-sim/exts/omni.isaac.kit/omni/isaac/kit/simulation_app.py', '/isaac-sim/apps/omni.isaac.sim.python.kit', '--/app/tokens/exe-path=/isaac-sim/kit', '--/persistent/app/viewport/displayOptions=3094', '--/rtx/materialDb/syncLoads=True', '--/rtx/hydra/materialSyncLoads=True', '--/omni.kit.plugin/syncUsdLoads=True', '--/app/renderer/resolution/width=1920', '--/app/renderer/resolution/height=1080', '--/app/window/width=1440', '--/app/window/height=900', '--/renderer/multiGpu/enabled=True', '--/app/fastShutdown=True', '--ext-folder', '/isaac-sim/exts', '--ext-folder', '/isaac-sim/apps', '--/physics/cudaDevice=0', '--portable', '--no-window', '--allow-root']
Passing the following args to the base kit application:  ['--headless_mode', 'native']
[Info] [carb] Logging to file: /isaac-sim/kit/logs/Kit/Isaac-Sim/2023.1/kit_20240120_190235.log
2024-01-21 03:02:35 [0ms] [Warning] [omni.kit.app.plugin] No crash reporter present, dumps uploading isn't available.
[0.029s] [ext: omni.kit.async_engine-0.0.0] startup
[0.248s] [ext: omni.activity.core-1.0.1] startup
[0.250s] [ext: omni.assets.plugins-0.0.0] startup
[0.251s] [ext: omni.stats-0.0.0] startup
[0.251s] [ext: omni.client-1.0.1] startup
[0.258s] [ext: omni.activity.profiler-1.0.2] startup
[0.260s] [ext: omni.gpu_foundation-0.0.0] startup
[0.265s] [ext: omni.rtx.shadercache.vulkan-1.0.0] startup
[0.266s] [ext: carb.windowing.plugins-1.0.0] startup
2024-01-21 03:02:36 [740ms] [Warning] [carb.windowing-glfw.gamepad] Joystick with unknown remapping detected (will be ignored):  ELAN06FA:00 04F3:327E Touchpad [18000000f30400007e32000000010000]
2024-01-21 03:02:36 [741ms] [Warning] [carb.windowing-glfw.gamepad] Joystick with unknown remapping detected (will be ignored):  MX MCHNCL M Keyboard [050000006d04000067b3000010000000]
2024-01-21 03:02:36 [741ms] [Warning] [carb.windowing-glfw.gamepad] Joystick with unknown remapping detected (will be ignored):  ITE Tech. Inc. ITE Device(8258) Keyboard [030000008d04000088c9000010010000]
[0.749s] [ext: omni.kit.renderer.init-0.0.0] startup
MESA: warning: Driver does not support the 0xa788 PCI ID.
MESA: warning: Driver does not support the 0xa788 PCI ID.
libGL error: failed to create dri screen
libGL error: failed to load driver: iris
MESA: warning: Driver does not support the 0xa788 PCI ID.
libGL error: failed to create dri screen
libGL error: failed to load driver: iris
X Error of failed request:  GLXBadFBConfig
  Major opcode of failed request:  152 (GLX)
  Minor opcode of failed request:  0 ()
  Serial number of failed request:  176
  Current serial number in output stream:  176
2024-01-21 03:02:36 [844ms] [Warning] [omni.core.ITypeFactory] Module /isaac-sim/kit/exts/omni.activity.core/bin/libomni.activity.core.plugin.so remained loaded after unload request.
There was an error running python

For some reason omni_python does not work. Do you think that might indicate something went wrong when building the docker image?

BolunDai0216 commented 10 months ago

Should the Isaac Sim container show something like this:

NVIDIA Release 23.08 (build 66128610)
PyTorch Version 2.1.0a0+29c30b1

Container image Copyright (c) 2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.

Copyright (c) 2014-2023 Facebook Inc.
Copyright (c) 2011-2014 Idiap Research Institute (Ronan Collobert)
Copyright (c) 2012-2014 Deepmind Technologies    (Koray Kavukcuoglu)
Copyright (c) 2011-2012 NEC Laboratories America (Koray Kavukcuoglu)
Copyright (c) 2011-2013 NYU                      (Clement Farabet)
Copyright (c) 2006-2010 NEC Laboratories America (Ronan Collobert, Leon Bottou, Iain Melvin, Jason Weston)
Copyright (c) 2006      Idiap Research Institute (Samy Bengio)
Copyright (c) 2001-2004 Idiap Research Institute (Ronan Collobert, Samy Bengio, Johnny Mariethoz)
Copyright (c) 2015      Google Inc.
Copyright (c) 2015      Yangqing Jia
Copyright (c) 2013-2016 The Caffe contributors
All rights reserved.

Various files include modifications (c) NVIDIA CORPORATION & AFFILIATES.  All rights reserved.

This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:
https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license

when I start it? It is not showing anything right now.

balakumar-s commented 10 months ago

Does headless work in this docker?

Try running: ./runheadless.native.sh -v

BolunDai0216 commented 10 months ago

@balakumar-s It does not. It gives the same error

MESA: warning: Driver does not support the 0xa788 PCI ID.
2024-01-21 03:51:08 [1,402ms] [Info] [gpu.foundation.plugin] Skipping createScoringSurface for no-window request.
MESA: warning: Driver does not support the 0xa788 PCI ID.
libGL error: failed to create dri screen
libGL error: failed to load driver: iris
2024-01-21 03:51:08 [1,428ms] [Info] [carb.launcher.plugin] [parent]: successfully launched the child process 73.
2024-01-21 03:51:08 [1,428ms] [Info] [carb.launcher.plugin] [parent]: launched the child process 73 from the parent 29.
MESA: warning: Driver does not support the 0xa788 PCI ID.2024-01-21 03:51:08 [1,428ms] [Info] [carb.launcher.plugin] [parent]: successfully launched the child process 73 {process = 0x7fc19402a420}

libGL error: failed to create dri screen
libGL error: failed to load driver: iris
2024-01-21 03:51:08 [1,429ms] [Info] [carb.launcher.plugin] starting the read thread for 'launcher stdout reader' for the child process 73.
2024-01-21 03:51:08 [1,429ms] [Info] [carb.launcher.plugin] starting the read thread for 'launcher stderr reader' for the child process 73.
X Error of failed request:  GLXBadFBConfig
  Major opcode of failed request:  152 (GLX)
  Minor opcode of failed request:  0 ()
  Serial number of failed request:  176
  Current serial number in output stream:  176

BolunDai0216 commented 10 months ago

I get the same error even after rebuilding the docker images.

balakumar-s commented 10 months ago

Can you try with docker started with start_docker.sh isaac_sim_2023.1.0 ?

BolunDai0216 commented 10 months ago

I get the same error.

balakumar-s commented 10 months ago

Can you paste here the output of nvidia-smi?
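When comparing several machines (for example a laptop against a workstation), the same information can be collected in a machine-readable form. The sketch below uses nvidia-smi's CSV query mode; `parse_gpu_query` and `query_gpus` are hypothetical helper names, and the query is limited to the GPU name and driver version.

```python
import subprocess


def parse_gpu_query(csv_text):
    """Parse 'name, driver_version' rows printed by nvidia-smi in CSV mode."""
    gpus = []
    for line in csv_text.strip().splitlines():
        # Split on the last comma so a comma inside the name cannot break parsing.
        name, driver = (field.strip() for field in line.rsplit(",", 1))
        gpus.append({"name": name, "driver_version": driver})
    return gpus


def query_gpus():
    """Run nvidia-smi and return one dict per GPU (requires the NVIDIA driver)."""
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,driver_version", "--format=csv,noheader"],
        check=True,
        capture_output=True,
        text=True,
    ).stdout
    return parse_gpu_query(out)


if __name__ == "__main__":
    print(query_gpus())
```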

BolunDai0216 commented 10 months ago

The output of nvidia-smi is

nvidia-smi
Sun Jan 21 04:39:00 2024       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.146.02             Driver Version: 535.146.02   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce RTX 4080 ...    Off | 00000000:01:00.0 Off |                  N/A |
| N/A   40C    P8               2W /  55W |    497MiB / 12282MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
+---------------------------------------------------------------------------------------+

Yucheng-Tang commented 3 months ago

Hello, I had a similar problem with the current curobo version. Did you find a way to solve it?

I had no problem with the official Isaac Sim docker, but I encountered this problem when I opened the Isaac Sim 4.0.0 docker from curobo. The container can run other GUIs; only Isaac Sim reports this error:

MESA: warning: Driver does not support the 0xa788 PCI ID.
libGL error: failed to create dri screen
libGL error: failed to load driver: iris

Yucheng-Tang commented 3 months ago

Hi, I just found out why this is happening. Thanks for the reply!

I think it's because the mesa package doesn't automatically recognize mobile GPUs, so the OpenGL version in the container is 3.1 on my laptop (Omniverse requires OpenGL>=4.6). So I manually set the environment variable MESA_LOADER_DRIVER_OVERRIDE=4.6 and the problem disappeared.

I also tried it on a workstation with the same version of the GPU driver. The problem does not occur there, and the OpenGL version after the build is 4.6.
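To check which OpenGL version a container actually reports, before and after setting an override, something like the sketch below may help. It assumes `glxinfo` is installed in the container; the helper names are hypothetical, and the 4.6 minimum is the figure quoted in this comment. The environment variable is passed in as a parameter on purpose: the name above is given as reported, while Mesa's documentation describes MESA_LOADER_DRIVER_OVERRIDE as selecting a driver by name and MESA_GL_VERSION_OVERRIDE as changing the reported OpenGL version, so it may be worth trying both.

```python
import os
import re
import subprocess

REQUIRED_GL = (4, 6)  # minimum quoted in the comment above


def parse_gl_version(glxinfo_text):
    """Extract (major, minor) from the 'OpenGL version string' line, or None."""
    match = re.search(
        r"^OpenGL version string:\s*(\d+)\.(\d+)", glxinfo_text, re.MULTILINE
    )
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def meets_requirement(version, required=REQUIRED_GL):
    """True when a version was found and it is at least the required one."""
    return version is not None and version >= required


def glxinfo_version(extra_env=None):
    """Run 'glxinfo -B' with optional extra environment variables set."""
    env = dict(os.environ, **(extra_env or {}))
    out = subprocess.run(
        ["glxinfo", "-B"], check=True, capture_output=True, text=True, env=env
    ).stdout
    return parse_gl_version(out)


if __name__ == "__main__":
    before = glxinfo_version()
    after = glxinfo_version({"MESA_LOADER_DRIVER_OVERRIDE": "4.6"})
    print(before, meets_requirement(before))
    print(after, meets_requirement(after))
```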

On Fri, 9 Aug 2024 at 17:51, Bolun Dai @.***> wrote:

@Yucheng-Tang I recall I did resolve the issue, but I don't remember what I did...

Did you try installing the mesa related packages?


BolunDai0216 commented 3 months ago

@Yucheng-Tang Thanks for also sharing the solution!

balakumar-s commented 3 months ago

Thanks @Yucheng-Tang for finding this solution.