[Bug Report] Program aborted execution due to an unhandled error when running API Demos 1 and 2

Toni-SM commented 1 year ago

Describe the bug

Demos 1 and 2 in API Demos abort execution due to an unhandled error.

Steps to reproduce

Run demos 1 and 2 in API Demos

Spawn different quadrupeds, visualize feet markers, and make robots stand using position commands:
```
./orbit.sh -p source/standalone/demo/play_quadrupeds.py
```
kit_20230118_201202.log
Spawn multiple Franka arms and apply random position commands:
```
./orbit.sh -p source/standalone/demo/play_arms.py --robot franka_panda
```
kit_20230118_201501.log

System Info

Describe the characteristic of your environment:

Commit: dd24040b115fbaa705a83fee4eac9afaa1481be0
Isaac Sim Version: 2022.2.0
OS: Ubuntu 20.04
GPU: RTX 3080
CUDA: 11.8
GPU Driver: 520.61.05

Additional context

Demo 3 runs well

Checklist

[x] I have checked that there is no similar issue in the repo (required)
[x] I have checked that the issue is not in running Isaac Sim itself and is related to the repo

Mayankm96 commented 1 year ago

Just to understand what could be happening, could you also check whether the issue happens when running the Isaac Sim example with multiple Frankas? It performs a similar set of operations.

cd ${ISAACSIM_PATH}
./python.sh standalone_examples/api/omni.isaac.core/add_frankas.py

Toni-SM commented 1 year ago

There is no error running the Isaac Sim example

Screenshot from 2023-01-25 22-07-02

Demos 1 and 2 keep crashing

ADebor commented 1 year ago

Hi @Toni-SM, @Mayankm96 ,

I don't know if you figured out what is happening, but I just wanted to let you know I'm experiencing the exact same problem with demos 1 and 2, as well as with the teleoperation script located at source/standalone/environments/teleoperation/teleop_se3_agent.py, as you can see below.

Just like @Toni-SM, there is no error running the Isaac Sim example.

My environment:

Commit: 7fe10e1
Isaac Sim Version: 2022.2.0
OS: Ubuntu 20.04
GPU: RTX 3060
CUDA: 12.0
GPU Driver: 525.85.05

Mayankm96 commented 1 year ago

Hi @ADebor ,

Thanks for bringing this up again. We weren't able to find the issue yet but from what I discussed with Toni, he was having this issue when running the scripts on a laptop (they worked fine on his desktop with a similar configuration).

For you, is this happening on a laptop setup as well? Do other scripts also not work (example: play_arms.py)?

ADebor commented 1 year ago

Hi @Mayankm96,

Thanks for your quick answer.

Yes, the problem does happen when running the scripts on my laptop (Thinkpad X1 Extreme Gen5). As for the scripts, I encounter the problem with the first two demos (play_quadrupeds.py and play_arms.py) and with the teleop script I mentioned in my previous comment. play_ik_control.py works, though.

Mayankm96 commented 1 year ago

Hi @Toni-SM and @ADebor ,

Isaac Sim 2022.2.1 was released recently. Could you check whether the issue still persists with it?

Thanks a lot!

Toni-SM commented 1 year ago

I'm in. By the way, I get the following error when running ./orbit.sh --extra:

toni@HP-ZBook-Studio-G8:~/isaacorbit/Orbit$ ./orbit.sh --extra
[INFO] Installing extra requirements such as learning frameworks...
/home/toni/.local/share/ov/pkg/isaac_sim-2022.2.1/kit/python/bin/python3: can't open file '/home/toni/isaacorbit/Orbit/source/extensions/omni.isaac.orbit_envs[all]': [Errno 2] No such file or directory
There was an error running python

Toni-SM commented 1 year ago

Hi @Mayankm96

Demos 1 and 2 still fail.
Demos 3 and 4 run fine.
The following also works nicely:

cd ${ISAACSIM_PATH}
./python.sh standalone_examples/api/omni.isaac.core/add_frankas.py

demo2-kit_20230324_105000.log demo1-kit_20230324_104833.log

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 520.61.05    Driver Version: 520.61.05    CUDA Version: 11.8     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  On   | 00000000:01:00.0 Off |                  N/A |
| N/A   47C    P8    18W /  N/A |     12MiB / 16384MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      1574      G   /usr/lib/xorg/Xorg                  4MiB |
|    0   N/A  N/A      2484      G   /usr/lib/xorg/Xorg                  4MiB |
+-----------------------------------------------------------------------------+

HadiBeyzaee commented 1 year ago

Hi @ADebor, did you find a solution for Demos 1 and 2 and for ./orbit.sh --extra? I get the same errors.

Mayankm96 commented 1 year ago

The issue with orbit.sh --extra should be fixed now (#41). For the demos, I have requested a laptop with a GPU to see what could be the issue here.

Mayankm96 commented 1 year ago

@HadiBeyzaee9775 or others,

I was looking into the scripts and I think I found the possible issue. When the backend is set to torch, it seems that one has to specify the device to use with the backend. This can be seen in the logs @Toni-SM shared above: there, the devices for all the views are None, which could be what is causing the segmentation fault in the code.

I fixed the device to "cpu" on the branch fix/device-in-demos. Can you please try it out to see if that fixes the issue?

git remote add upstream https://github.com/NVIDIA-Omniverse/Orbit.git
git fetch upstream fix/device-in-demos
git checkout fix/device-in-demos
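
The device problem described in this comment can be sketched in plain Python, so that it runs without Isaac Sim. Note that `resolve_device` is a hypothetical helper and not part of Orbit or Isaac Sim; it only illustrates the idea of never letting a `None` device through when the backend is torch. The fallback to "cpu" mirrors what the branch above is described as doing.

```python
from typing import Optional


def resolve_device(backend: str, device: Optional[str], default: str = "cpu") -> Optional[str]:
    """Return an explicit device string for the given backend.

    Hypothetical helper (not an Orbit or Isaac Sim API): with the torch
    backend, a missing device is replaced by `default` so it is never None.
    """
    if backend != "torch":
        # Other backends are passed through unchanged in this sketch.
        return device
    if device is None:
        return default
    if device != "cpu" and not device.startswith("cuda"):
        raise ValueError(f"unsupported device for torch backend: {device!r}")
    return device


if __name__ == "__main__":
    print(resolve_device("torch", None))      # falls back to the default
    print(resolve_device("torch", "cuda:0"))  # an explicit device is kept
```

The resolved value would then be passed as the `device` argument wherever the simulation context is created.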

zhuyijie88 commented 1 year ago

@Mayankm96 , it still has the same fault with branch [fix/device-in-demos].

My environment:

Commit: b7263b0
Isaac Sim Version: 2022.2.1
OS: Ubuntu 20.04
GPU: RTX 3080
CUDA: 11.3
GPU Driver: 525.60.11

zhuyijie88 commented 1 year ago

Adding device='cuda' solves the problem. For example, in Line 91 of play_quadrupeds.py, change the code to:

sim = SimulationContext(stage_units_in_meters=1.0, physics_dt=0.005, rendering_dt=0.005, backend="torch", device='cuda')
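
Rather than hard-coding the device, one option is to derive it from a command-line flag. The sketch below only builds the keyword arguments, so it runs without Isaac Sim. The `--cpu` flag name and the `build_sim_kwargs` helper are assumptions for illustration, not something the demo scripts are known to provide; the numeric values are the ones from the line above.

```python
import argparse
from typing import Any, Dict, List, Optional


def build_sim_kwargs(argv: Optional[List[str]] = None) -> Dict[str, Any]:
    """Build keyword arguments for SimulationContext with an explicit device.

    Hypothetical helper: `--cpu` selects the CPU, otherwise "cuda" is used.
    """
    parser = argparse.ArgumentParser(description="Device selection sketch.")
    parser.add_argument("--cpu", action="store_true", help="Run the simulation on the CPU.")
    args = parser.parse_args(argv)
    return {
        "stage_units_in_meters": 1.0,
        "physics_dt": 0.005,
        "rendering_dt": 0.005,
        "backend": "torch",
        # Always explicit, so the device is never left unset.
        "device": "cpu" if args.cpu else "cuda",
    }


if __name__ == "__main__":
    kwargs = build_sim_kwargs([])
    print(kwargs)
    # In the demo script this would become: sim = SimulationContext(**kwargs)
```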

ADebor commented 1 year ago

Hi there,

Sorry for the late response.

Demos 1 and 2 work fine on my end with the new Isaac Sim release, using the same code as before. Demos 3 and 4 still work as well. I've just git pulled the repo's main branch and these 4 demos still work just fine. No need to add device='cuda' on my end.
orbit.sh --extra does not fail anymore, as expected.
The teleop example (source/standalone/environments/teleoperation/teleop_se3_agent.py) still fails, though. Whether or not it is launched with the --cpu option, the script runs, but I'm not able to control the robot using the keyboard. Or maybe I'm not using this example correctly: I simply launch it in the terminal and press keys on my keyboard, which prints them in that same terminal but does not move the robot in the simulator. When pressing keys in the simulator, 'K' does toggle the gripper (which is expected), but pressing other keys simply selects different modes in the left panel of the simulator.

Toni-SM commented 1 year ago

Adding device='cuda' solves the problem.

This change works. However, robots do nothing!

Screenshot from 2023-03-31 11-24-04

ADebor commented 1 year ago

Hi @Toni-SM ,

Did you wait for several "Resetting robots state..."? The robots do move on my side, but only to reach a random pose once per reset. They do not move randomly at all steps.

Toni-SM commented 1 year ago

Hi @ADebor

I have run the example for quite some time, but nothing happens. The robots do not go to a random pose at each reset, nor do they open or close the gripper, as they do when I run the code on a workstation.

Mayankm96 commented 1 year ago

@Toni-SM, the UI shows a prompt to enable scene graph instancing. Can you select that to enable it?

@ADebor, let's make a separate issue for the keyboard. It could be that Kit has its own callbacks bound to those keys, which are blocking the ones added to the keyboard control interface.

Toni-SM commented 1 year ago

Hi @Mayankm96

Enabling the scene graph instancing in UI does not do anything

Screenshot from 2023-03-31 13-36-23

kumar-sanjeeev commented 1 year ago

Hello @Toni-SM ,

Were you able to figure out why the robots are not doing anything even after specifying device="cuda" and enabling scene graph instancing?

I am facing the same issue on my end, although the play_cloner.py script from the Tutorials [Core] section works absolutely fine. Its code is almost the same as play_arms.py; it just adds more robots to the scene.

My System Info

Isaac Sim Version: 2022.2.1
OS: Ubuntu 20.04
GPU: RTX A300
CUDA: 11.7
GPU Driver: 515.105.01

kumar-sanjeeev commented 1 year ago

Hello @Toni-SM and @Mayankm96,

Now I am able to run the robot using the source/standalone/demo/play_arms.py script by specifying device="cuda". I made the following changes in the source code, referring to source/standalone/demo/play_cloner.py.

Changes

from omni.isaac.core.utils.carb import set_carb_setting

def main():
    """Spawns a single arm manipulator and applies random joint commands."""

    # Load kit helper
    sim = SimulationContext(physics_dt=0.01, rendering_dt=0.01, backend="torch", device="cuda")
    # Set main camera
    set_camera_view([2.5, 2.5, 2.5], [0.0, 0.0, 0.0])

    ############# changes
    # Enable flatcache, which avoids passing data over to the USD structure
    # this speeds up the read-write operations on GPU buffers
    if sim.get_physics_context().use_gpu_pipeline:
        sim.get_physics_context().enable_flatcache(True)
    # Enable hydra scene-graph instancing
    # this is needed to visualize the scene when flatcache is enabled
    set_carb_setting(sim._settings, "/persistent/omnihydra/useSceneGraphInstancing", True)
    ###############

After these changes, I no longer need to enable scene graph instancing, but I still need to specify the device="cuda".

Mayankm96 commented 1 year ago

I finally got my hands on a machine locally where we could reproduce this issue. This is happening on systems that have an Intel graphics card along with an NVIDIA RTX card. If the monitor is configured to use the Intel card, then the CPU simulation is failing while the GPU simulation works fine.

At least for the PC where we saw this issue, we just plugged the monitor into the NVIDIA graphics card and it started working fine. On laptops, I suppose you can use prime-select.
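
For laptops, a rough way to check which PRIME profile is active before launching the demos is sketched below. It assumes an Ubuntu setup where the `prime-select` tool (from the nvidia-prime package) is installed; the function names are hypothetical, and the script only reads the current profile without changing anything.

```python
import shutil
import subprocess
from typing import Optional


def query_prime_profile() -> Optional[str]:
    """Return the profile printed by `prime-select query`, or None if unavailable."""
    exe = shutil.which("prime-select")
    if exe is None:
        return None
    try:
        result = subprocess.run([exe, "query"], capture_output=True, text=True, timeout=10)
    except (OSError, subprocess.SubprocessError):
        return None
    if result.returncode != 0:
        return None
    return result.stdout.strip() or None


def advice(profile: Optional[str]) -> str:
    """Turn the reported profile into a short hint (illustrative only)."""
    if profile is None:
        return "prime-select not available; check manually which GPU drives the display"
    if profile == "nvidia":
        return "NVIDIA profile active"
    return f"profile is '{profile}'; consider 'sudo prime-select nvidia', then log out or reboot"


if __name__ == "__main__":
    print(advice(query_prime_profile()))
```

If the reported profile is not `nvidia`, switching it and restarting the session would be the laptop equivalent of moving the monitor cable on a desktop.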

Would be great to know if this fixes the issues.

Mayankm96 commented 6 months ago

Since the project has undergone large changes and there has been no further response on this issue, I am closing it.

I hope the fix above does solve the problem for you. If the issue persists with the latest main, please feel free to open the issue again.
