isaac-sim / IsaacLab

Unified framework for robot learning built on NVIDIA Isaac Sim
https://isaac-sim.github.io/IsaacLab

[Bug Report] Program aborted the execution due to unhandled error when running API Demos 1 and 2 #1

Closed Toni-SM closed 6 months ago

Toni-SM commented 1 year ago

Describe the bug

Demos 1 and 2 in API Demos abort execution due to an unhandled error.

Steps to reproduce

Run demos 1 and 2 in API Demos

System Info

Describe the characteristic of your environment:

Additional context

Checklist

Mayankm96 commented 1 year ago

Just to understand what could be happening, could you also check whether the issue happens when running the Isaac Sim example with multiple Frankas? It does a similar set of operations.

cd ${ISAACSIM_PATH}
./python.sh standalone_examples/api/omni.isaac.core/add_frankas.py
Toni-SM commented 1 year ago

There is no error running the Isaac Sim example

Screenshot from 2023-01-25 22-07-02

Demos 1 and 2 keep crashing

ADebor commented 1 year ago

Hi @Toni-SM, @Mayankm96 ,

I don't know if you figured out what is happening, but I just wanted to let you know I'm experiencing the exact same problem with demos 1 and 2, as well as with the teleoperation script located at source/standalone/environments/teleoperation/teleop_se3_agent.py, as you can see below.

image

Just like @Toni-SM, there is no error running the Isaac Sim example.

My environment:
Commit: 7fe10e1
Isaac Sim Version: 2022.2.0
OS: Ubuntu 20.04
GPU: RTX 3060
CUDA: 12.0
GPU Driver: 525.85.05

Mayankm96 commented 1 year ago

Hi @ADebor ,

Thanks for bringing this up again. We weren't able to find the issue yet but from what I discussed with Toni, he was having this issue when running the scripts on a laptop (they worked fine on his desktop with a similar configuration).

For you, is this happening on a laptop setup as well? Do other scripts also not work (example: play_arms.py)?

ADebor commented 1 year ago

Hi @Mayankm96,

Thanks for your quick answer.

Yes, the problem does happen when running the scripts on my laptop (Thinkpad X1 Extreme Gen5). As for the scripts, I encounter the problem when running the first two demos (play_quadrupeds.py and play_arms.py) and the teleop script I mentioned in my previous comment. play_ik_control.py works, though.

Mayankm96 commented 1 year ago

Hi @Toni-SM and @ADebor ,

Isaac Sim 2022.2.1 has been released recently. Could you check whether the issue still persists with it?

Thanks a lot!

Toni-SM commented 1 year ago

I'm in. Btw, I am getting the following error when running ./orbit.sh --extra

toni@HP-ZBook-Studio-G8:~/isaacorbit/Orbit$ ./orbit.sh --extra
[INFO] Installing extra requirements such as learning frameworks...                                                                                                                                       
/home/toni/.local/share/ov/pkg/isaac_sim-2022.2.1/kit/python/bin/python3: can't open file '/home/toni/isaacorbit/Orbit/source/extensions/omni.isaac.orbit_envs[all]': [Errno 2] No such file or directory
There was an error running python
Toni-SM commented 1 year ago

Hi @Mayankm96

Demos 1 and 2 still fail.
Demos 3 and 4 run fine.
And the following works nicely:

cd ${ISAACSIM_PATH}
./python.sh standalone_examples/api/omni.isaac.core/add_frankas.py

demo2-kit_20230324_105000.log demo1-kit_20230324_104833.log

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 520.61.05    Driver Version: 520.61.05    CUDA Version: 11.8     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  On   | 00000000:01:00.0 Off |                  N/A |
| N/A   47C    P8    18W /  N/A |     12MiB / 16384MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      1574      G   /usr/lib/xorg/Xorg                  4MiB |
|    0   N/A  N/A      2484      G   /usr/lib/xorg/Xorg                  4MiB |
+-----------------------------------------------------------------------------+
HadiBeyzaee commented 1 year ago

Hi @ADebor, did you find any solution for Demos 1 and 2 and for ./orbit.sh --extra? I have the same errors.

Mayankm96 commented 1 year ago

The issue with orbit.sh --extra should be fixed now (#41). For the demos, I have requested a laptop with a GPU to see what could be the issue here.

Mayankm96 commented 1 year ago

@HadiBeyzaee9775 or others,

I was looking into the scripts and I think I found the possible issue. When the backend is set to torch, it seems that one has to specify the device to use with the backend. This can be seen in the logs @Toni-SM shared above. Over there the devices for all the views are None which is what could be causing the segmentation fault in the code.

I fixed the device to "cpu" on the branch fix/device-in-demos. Can you please try it out to see if that fixes the issue?

git remote add upstream https://github.com/NVIDIA-Omniverse/Orbit.git
git fetch upstream fix/device-in-demos
git checkout fix/device-in-demos
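
To illustrate the point about always passing an explicit device with the torch backend, here is a minimal sketch. The helper name build_sim_kwargs is hypothetical and not part of Orbit or Isaac Sim; it only assembles the keyword arguments (physics_dt, rendering_dt, backend, device) that the demos pass to SimulationContext, and it refuses to leave the device as None.

```python
from typing import Any, Dict, Optional


def build_sim_kwargs(
    backend: str = "torch",
    device: Optional[str] = None,
    dt: float = 0.005,
) -> Dict[str, Any]:
    """Hypothetical helper: build keyword arguments for SimulationContext.

    Raises ValueError when the torch backend is requested without a device,
    so an unset device is caught early in Python.
    """
    if backend == "torch" and device is None:
        raise ValueError('backend="torch" requires an explicit device, e.g. "cpu" or "cuda"')
    return {
        "physics_dt": dt,
        "rendering_dt": dt,
        "backend": backend,
        "device": device,
    }


# Intended use inside a demo script (assumes Isaac Sim is already launched):
#   sim = SimulationContext(**build_sim_kwargs(device="cuda"))
```

This only makes the missing device visible; it does not by itself determine whether "cpu" or "cuda" works on a given machine.
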
zhuyijie88 commented 1 year ago

@Mayankm96 , it still has the same fault with branch [fix/device-in-demos].

My environment:
Commit: b7263b0
Isaac Sim Version: 2022.2.1
OS: Ubuntu 20.04
GPU: RTX 3080
CUDA: 11.3
GPU Driver: 525.60.11

zhuyijie88 commented 1 year ago

Adding device='cuda' solves the problem. For example, in Line 91 of play_quadrupeds.py, change the code to:

sim = SimulationContext(stage_units_in_meters=1.0, physics_dt=0.005, rendering_dt=0.005, backend="torch", device='cuda')
ADebor commented 1 year ago

Hi there,

Sorry for the late response.

Toni-SM commented 1 year ago

Adding device='cuda' solves the problem.

This change works. However, robots do nothing!

Screenshot from 2023-03-31 11-24-04

ADebor commented 1 year ago

Hi @Toni-SM ,

Did you wait for several "Resetting robots state..."? The robots do move on my side, but only to reach a random pose once per reset. They do not move randomly at all steps.

Toni-SM commented 1 year ago

Hi @ADebor

I have run the example for quite some time but nothing. The robots do not go to a random pose at each reset nor do they open or close the gripper, as they do when I run the code on a workstation.

Mayankm96 commented 1 year ago

@Toni-SM, on the UI it says to enable scene graph instancing? Can you select that to enable it?

@ADebor let's make a separate issue for the keyboard. It could be that Kit has its own callbacks bound to those keys, which are blocking the ones added to the keyboard control interface.

Toni-SM commented 1 year ago

Hi @Mayankm96

Enabling the scene graph instancing in UI does not do anything

Screenshot from 2023-03-31 13-36-23

kumar-sanjeeev commented 1 year ago

Hello @Toni-SM ,

Were you able to figure out why the robots are not doing anything even after specifying device="cuda" and enabling scene graph instancing?

I am facing the same issue on my end, although the play_cloner.py script from the Tutorials [Core] section works absolutely fine. Its code is almost the same as play_arms.py; it just adds more robots to the scene.

My System Info
kumar-sanjeeev commented 1 year ago

Hello @Toni-SM and @Mayankm96,

Now I am able to run the robot using the source/standalone/demo/play_arms.py script by specifying device="cuda". I made the following changes in the source code, referring to source/standalone/demo/play_cloner.py.

Changes

from omni.isaac.core.utils.carb import set_carb_setting

def main():
    """Spawns a single arm manipulator and applies random joint commands."""

    # Load kit helper
    sim = SimulationContext(physics_dt=0.01, rendering_dt=0.01, backend="torch", device="cuda")
    # Set main camera
    set_camera_view([2.5, 2.5, 2.5], [0.0, 0.0, 0.0])

    ############# changes
    # Enable flatcache which avoids passing data over to USD structure
    # this speeds up the read-write operation of GPU buffers
    if sim.get_physics_context().use_gpu_pipeline:
        sim.get_physics_context().enable_flatcache(True)
    # Enable hydra scene-graph instancing
    # this is needed to visualize the scene when flatcache is enabled
    set_carb_setting(sim._settings, "/persistent/omnihydra/useSceneGraphInstancing", True)
    ###############

After these changes, I no longer need to enable scene graph instancing in the UI, but I still need to specify device="cuda".

Mayankm96 commented 1 year ago

I finally got my hands on a machine locally where we could reproduce this issue. This is happening on systems that have an Intel graphics card along with an NVIDIA RTX card. If the monitor is configured to use the Intel card, then the CPU simulation is failing while the GPU simulation works fine.

At least for the PC where we saw this issue, we just plugged the monitor to the NVIDIA graphics card and it started working fine. On laptops, I suppose you can use prime-select.

Would be great to know if this fixes the issues.
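
For anyone checking this on a laptop, here is a minimal sketch that reads the active PRIME profile before launching the demos. It assumes an Ubuntu system with the nvidia-prime package, which provides prime-select; the two function names are hypothetical helpers written for this example, not part of Orbit or Isaac Sim.

```python
import shutil
import subprocess
from typing import Optional


def query_prime_profile() -> Optional[str]:
    """Return the output of `prime-select query`, or None if it cannot be read."""
    if shutil.which("prime-select") is None:
        return None  # tool not installed (assumption: Ubuntu with nvidia-prime)
    result = subprocess.run(
        ["prime-select", "query"], capture_output=True, text=True, check=False
    )
    if result.returncode != 0:
        return None
    return result.stdout.strip() or None


def display_uses_nvidia(profile: Optional[str]) -> Optional[bool]:
    """True if the profile is 'nvidia', False otherwise, None if unknown."""
    if profile is None:
        return None
    return profile.strip().lower() == "nvidia"
```

Calling display_uses_nvidia(query_prime_profile()) gives a quick yes/no/unknown answer. If it reports False, switching with `sudo prime-select nvidia` and then logging out or rebooting is the usual way to move the display to the NVIDIA card, though whether that resolves this particular crash still needs confirmation from the people affected.
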

Mayankm96 commented 6 months ago

Since the project has undergone large changes and there has been no further response on this issue, I am closing it.

I hope the fix above does solve the problem for you. If the issue persists with the latest main, please feel free to open the issue again.