qcr / benchbot

BenchBot is a tool for seamlessly testing & evaluating semantic scene understanding tools in both realistic 3D simulation & on real robots
BSD 3-Clause "New" or "Revised" License

install error #95

Closed wanderHZ closed 1 year ago

wanderHZ commented 1 year ago

Hello, I ran benchbot_install and encountered the following error:

(screenshot of the error; image 3b049874ea64ac70e32cb18b5746c2b)

From my preliminary investigation, the error comes from the command shown below (the error disappears when I comment out that line). Do you know what causes it or how to fix it?

(screenshot of the highlighted line; image 5772ff8ce4b524c78c407df556d5d85)

Thank you for your timely reply!

david2611 commented 1 year ago

The line you have highlighted is simply one used for starting the Isaac Sim simulator. We run it during installation to do some startup operations that need to be done once and then never again so as to speed up later instances of the simulator.

Can I check whether you fully completed the installation process with this line commented out? Were you able to run the simulator without that line in your installation?

The problem appears to be with running Omniverse on your system at all, as that line is not doing anything "special".

Did you have drivers pre-installed on your system that met the minimum requirements or did you get them installed by the BenchBot installation process?

wanderHZ commented 1 year ago

My installation environment is Ubuntu 18.04 with an NVIDIA RTX 3090, and I am using the previous version of BenchBot (v2.4.4). All drivers were installed directly by the BenchBot installer. With this line commented out, the installation completes normally, but the simulator cannot be run. The error shown in the image below appears, and the full log output follows the image.

(screenshot of the error; image 1681176096208)

################################################################################
################# CHECKING FOR BENCHBOT SOFTWARE STACK UPDATES #################
################################################################################

Skipping ...

################################################################################
###################### CLEANING UP ALL BENCHBOT REMNANTS #######################
################################################################################

Deleted the following containers:
Deleted Containers:
6d9d7baae97483805816bef0514f9e5779396a96e0382966b21660858f49c6c9

Deleted Networks:
benchbot_network

Deleted Images:
deleted: sha256:486a44f6af19f162d8fc69af8783bbe13c130d5210b8b552df1692b85a01cf05
deleted: sha256:a439ede6be6c8a5f73841b26c31343cdc997a191d35797001cf371caed6fd107
deleted: sha256:0175d8e55ca57f744c2c2d0552f7e7cf1babfdd8b1cabf5da7225cd951d894b6
deleted: sha256:8a7c03799b30c05b782cdb7665e091d9ff0b1c9a8975f6ffdde1579f16f66ad6
deleted: sha256:cffa42f0756978d6e731100d10b10f1d54d8ca32afa8b2dfc995ad4cadcf6126
deleted: sha256:befd893659eb656bed7fa7c5cb806aca48b672c18065b4388333aabef173b0d4
deleted: sha256:0aba7c0a963e673ec66ab3519f63d282111ec69f08e2a0c65eea0b82d3e8a1d2

Deleted build cache objects:
qbndlpi0deyctkt00ue24s1yk
hp3jqmarj73l9pbn2b328zbcw

Total reclaimed space: 11.79MB

Finished cleaning! (use 'benchbot_run -k' for a full clean)

################################################################################
##################### STARTING THE BENCHBOT SOFTWARE STACK #####################
################################################################################

Running the BenchBot system with the following settings:

    Selected task:        semantic_slam:passive:ground_truth
    Task results format:  object_map
    Selected robot:       carter_omni
    Selected environment: miniroom:1
    Scene/s:              miniroom:1, starting @ pose [0.7131, 0.0028, 0.0028, -0.701, 1.2, 1.5, 0.3]
                          (map_path = '.sim_data/miniroom_1.usd')
    Simulator required:   Yes (sim_omni)

Creating shared network 'benchbot_network':
c5880f6650445b83b278d0db6163336c5178842d013c6a130142567eb9a41f16

Starting persistent container for ROS core:
4262615866f91d013b10b4f3c0d21f51f9f786e139a7f835a691f02527dfd94e

Starting persistent container for BenchBot Robot Controller (sim_omni):
ef986054b4e0068fca47ab9a5223c7bcfaa82f106ae37869784be58f260eee76

Starting container for BenchBot Supervisor:
cb3c20bf616b78903af23f6b48d8b825261c24d6fbe626c782005122a10708ec

Starting container for BenchBot Debugging:
91b0eea59c2118519213669964074918342ae2b23513cffc8b3199980cbb398b

################################################################################
################### BENCHBOT IS RUNNING (Ctrl^C to exit) ... ###################
################################################################################

Initialising supervisor...

Configuring the supervisor...
Starting a supervisor with the following configuration:

{'environments': [{'_file_path': '/benchbot/addons/benchbot_addons/benchbot-addons/envs_bear_develop_sim_omni/environments/miniroom_1.yaml',
                   'description': 'Mini room environment with all of the base '
                                  'objects in their normal place.\n'
                                  'All of the other mini room environments are '
                                  'based off this with any combination\n'
                                  'of objects, object positions, & lighting '
                                  'changed\n',
                   'map_path': '.sim_data/miniroom_1.usd',
                   'name': 'miniroom',
                   'object_labels': [...],
                   'robots': [...],
                   'start_pose': [...],
                   'trajectory_poses': [...],
                   'type': 'sim_omni',
                   'variant': 1}],
 'results': {'_file_path': '/benchbot/addons/benchbot_addons/benchbot-addons/formats_object_map/formats/object_map.yaml',
             'description': 'The "object map" is a map of the objects in an '
                            'environment. Each object is\n'
                            'represented by a probability distribution '
                            'describing the suggested object\n'
                            "label, and the bounding box's 3D centroid and "
                            'extent. An "object map" also\n'
                            'requires a "class_list", which is used to create '
                            'object label probability\n'
                            'distributions.\n',
             'functions': {'create': 'object_map.create_empty',
                           'create_object': 'object_map.create_empty_object',
                           'validate': 'object_map.validate'},
             'name': 'object_map'},
 'robot': {'_file_path': '/benchbot/addons/benchbot_addons/benchbot-addons/robots_sim_omni/robots/carter_omni.yaml',
           'address': 'http://benchbot_robot:10000',
           'connections': {'image_depth': {...},
                           'image_depth_info': {...},
                           'image_rgb': {...},
                           'image_rgb_info': {...},
                           'laser': {...},
                           'move_angle': {...},
                           'move_distance': {...},
                           'move_next': {...},
                           'poses': {...}},
           'global_frame': 'map',
           'name': 'carter_omni',
           'persistent_cmds': ['/benchbot/benchbot_simulator/run -P 10001 & '
                               'x=$! && sleep 10 && curl -X POST '
                               'http://localhost:10001/start && wait $x\n',
                               'rosrun benchbot_robot_controller noisify_odom '
                               '\\\n'
                               '  noise_linear:=0.2 noise_angular:=0.1\n'],
           'persistent_status': 'curl -s localhost:10001/started | grep -q '
                                "'true'\n",
           'poses': ['base_link', 'initial_pose', 'camera_left', 'lidar'],
           'robot_frame': 'base_link',
           'run_cmd': 'rostopic pub -1 /odom_start_pose std_msgs/String "data: '
                      '\'$START_POSE\'" && curl -s -o /dev/null '
                      'localhost:10001/open_environment \\\n'
                      '  -H "Content-Type: application/json" \\\n'
                      '  -d \'{"environment": "$ENVS_PATH/$MAP_PATH"}\' &&\n'
                      'curl -s -o /dev/null localhost:10001/place_robot \\\n'
                      '  -H "Content-Type: application/json" \\\n'
                      '  -d \'{"robot": "$ROBOT_PATH/.robot_data/carter.usd", '
                      '"start_pose": "$START_POSE"}\'\n',
           'stop_cmd': 'curl -s -o /dev/null -X POST '
                       'localhost:10001/stop_sim\n',
           'type': 'sim_omni'},
 'task': {'_file_path': '/benchbot/addons/benchbot_addons/benchbot-addons/tasks_ssu/tasks/sslam_pgt.yaml',
          'actions': ['move_next'],
          'description': 'Use a Semantic SLAM algorithm to construct an object '
                         'map of the environment. An object map describes each '
                         'object with a probabilistic label suggestion, '
                         'spatial location, and optional probabilistic state '
                         'change suggestion. This task provides passive robot '
                         'control, and sensor observations with ground truth '
                         'robot pose.\n',
          'localisation': 'ground_truth',
          'name': 'semantic_slam:passive:ground_truth',
          'observations': ['image_depth',
                           'image_depth_info',
                           'image_rgb',
                           'image_rgb_info',
                           'laser',
                           'poses'],
          'results_format': 'object_map',
          'scene_count': 1,
          'type': 'sim_unreal'}}

Supervisor is now available @ 'http://0.0.0.0:10000' ...

Waiting until a robot controller is found @ 'http://benchbot_robot:10000' ... 

################################################################################
####################### BENCHBOT ROBOT CONTROLLER ERROR ########################
################################################################################

ERROR: The BenchBot Robot Controller container has exited unexpectedly. This 
should not happen under normal operating conditions. Please see the complete
log below for a dump of the crash output:

Requirement already satisfied: pip in /usr/local/lib/python3.6/dist-packages (21.3.1)
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv

################################################################################
###################### CLEANING UP ALL BENCHBOT REMNANTS #######################
################################################################################

Stopped the following containers:
91b0eea59c21
cb3c20bf616b

Deleted the following containers:
Deleted Containers:
91b0eea59c2118519213669964074918342ae2b23513cffc8b3199980cbb398b
cb3c20bf616b78903af23f6b48d8b825261c24d6fbe626c782005122a10708ec
ef986054b4e0068fca47ab9a5223c7bcfaa82f106ae37869784be58f260eee76

Total reclaimed space: 340kB

Finished cleaning! (use 'benchbot_run -k' for a full clean)

david2611 commented 1 year ago

Ah, I see. As a heads-up, using BenchBot will be incredibly tricky if you are installing the previous version from scratch.

You will need to roll back all of the other linked GitHub repos to the versions that worked with BenchBot 2.4.4 (before the Omniverse update for each of them) and manually find the correct versions of all BenchBot add-ons.

However, even with the old version, there shouldn't have been an issue at the line you had highlighted.

Can I please get information on your current NVIDIA driver? Your driver may be too new for the old version of BenchBot. I am assuming you are restricted to Ubuntu 18.04 and cannot upgrade, which is why you need the old version of BenchBot for your research?
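One way to collect the driver details asked for here is to query `nvidia-smi` directly. The following is a minimal sketch and not part of BenchBot: the helper names (`parse_driver_query`, `version_tuple`, `query_driver`) are made up for illustration, and it assumes `nvidia-smi` is on the PATH.

```python
import shutil
import subprocess


def parse_driver_query(text):
    """Parse `nvidia-smi --query-gpu=driver_version,name --format=csv,noheader`.

    Returns a list of (driver_version, gpu_name) tuples, one per GPU.
    """
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue
        version, _, name = line.partition(",")
        rows.append((version.strip(), name.strip()))
    return rows


def version_tuple(version):
    """Turn '530.30.02' into (530, 30, 2) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def query_driver():
    """Run nvidia-smi if it is installed; return [] if it is missing or fails."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=driver_version,name", "--format=csv,noheader"],
        capture_output=True, text=True, check=False,
    )
    if result.returncode != 0:
        return []
    return parse_driver_query(result.stdout)


if __name__ == "__main__":
    for version, name in query_driver():
        print(f"{name}: driver {version}")
```

Comparing `version_tuple(...)` values rather than raw strings avoids ordering mistakes when the version components have different widths.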

btalb commented 1 year ago

I'll just add here, @wanderHZ: those types of errors usually occur when you're running GUI programs on a remote server and there are issues with your hardware-accelerated rendering setup.

Can you provide us with some details of how you're running this (e.g. remotely, sitting right next to the machine, through SSH, using window forwarding, etc.)?
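As a rough aid for answering that question, the sketch below summarises a shell's session from standard environment variables. It is only an illustration: `describe_session` is a hypothetical helper, the `localhost:` check is a heuristic for SSH X11 forwarding, and it will not detect remote desktop tools that attach to a local session.

```python
import os


def describe_session(env):
    """Summarise how the current shell is attached to a display.

    `env` is a mapping such as os.environ. Only standard variables are read:
    SSH_CONNECTION (set by sshd), DISPLAY (X11), WAYLAND_DISPLAY, and
    XDG_SESSION_TYPE (set by the login session).
    """
    display = env.get("DISPLAY", "")
    over_ssh = "SSH_CONNECTION" in env
    return {
        "over_ssh": over_ssh,
        "display": display or None,
        # With SSH X11 forwarding, DISPLAY typically looks like
        # "localhost:10.0"; treat this as a hint, not proof.
        "x11_forwarding_likely": over_ssh and display.startswith("localhost:"),
        "session_type": env.get("XDG_SESSION_TYPE"),
        "has_display": bool(display or env.get("WAYLAND_DISPLAY")),
    }


if __name__ == "__main__":
    for key, value in describe_session(os.environ).items():
        print(f"{key}: {value}")
```

Posting this summary alongside a description of the physical setup (monitor attached or not, remote desktop software in use) gives a fuller picture than either alone.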

wanderHZ commented 1 year ago

Thank you for your reply. Since then, I have replaced Ubuntu 18.04 with 20.04 and again installed all of the drivers and environments through benchbot_install.sh, but I still get a similar error (screenshot 1681379517402).

Additionally, I connect to the machine through remote desktop software (e.g. TeamViewer), and the machine has no physical display attached.

Could the OpenGL version also have an impact? My OpenGL version is 3.1 Mesa 21.2.6. The rest of my environment is as follows.

################################################################################
######################## PART 1: EXAMINING SYSTEM STATE ########################
################################################################################

Core host system checks:
    Ubuntu version >= 20.04:                                  Passed (20.04)

Running Nvidia related system checks:
    NVIDIA GPU available:                     Found card of type '10de:2204'
    NVIDIA driver is running:                                          Found
    NVIDIA driver version valid:                           Valid (530.30.02)
    NVIDIA driver from a standard PPA:                          PPA is valid
    CUDA drivers installed:                                    Drivers found
    CUDA drivers version valid:                          Valid (530.30.02-1)
    CUDA drivers from the NVIDIA PPA:                           PPA is valid
    CUDA is installed:                                            CUDA found
    CUDA version valid:                                         Valid (12.1)
    CUDA is from the NVIDIA PPA:                                PPA is valid

Running Docker related system checks:
    Docker is available:                                               Found
    Docker version valid:                                     Valid (23.0.3)
    NVIDIA Container Toolkit installed:                       Found (1.13.0)
    Docker runs without root:                                         Passed

Running checks of filesystem used for Docker:
    /var/lib/docker on ext4 filesystem:                      Yes (/dev/sda2)
    /var/lib/docker supports suid:                                   Enabled
    /var/lib/docker drive space check:               Sufficient space (788G)

Miscellaneous requirements:
    Pip python package manager available:                     Found (23.0.1)
    Tkinter for Python installed:                                      Found
    PIL (with ImageTk) for Python install                              Found

Manual installation steps for Omniverse-powered Isaac Sim:
    License accepted for Omniverse:                                      Yes
    Access to nvcr.io Docker registry:                                   Yes

btalb commented 1 year ago

Thanks @wanderHZ.

Given that setup, I suspect that TeamViewer is generating a virtual display that is not hardware-rendered. Omniverse won't work with that (it requires hardware rendering, i.e. rendering powered by an NVIDIA GPU).

To confirm, can you please run the commands in this comment and post the output?

(My suspicion that this is the issue comes from the similar error received by the user in that thread.)
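A common way to check for software rendering is to look at the renderer that `glxinfo -B` reports (on Ubuntu, `glxinfo` ships in the `mesa-utils` package). The sketch below is not necessarily what the linked comment asks for; it assumes `glxinfo` is installed, and the helper names are made up for illustration. A renderer string naming one of Mesa's CPU rasterisers (such as `llvmpipe`) suggests the display is not GPU-rendered, and a version string that mentions Mesa rather than NVIDIA, like the "3.1 Mesa 21.2.6" reported above, points the same way.

```python
import shutil
import subprocess

# Renderer names used by Mesa's CPU (software) rasterisers.
SOFTWARE_RENDERERS = ("llvmpipe", "softpipe", "swrast")


def parse_glxinfo(text):
    """Extract the 'OpenGL ... string' fields from `glxinfo -B` output."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition(":")
        if sep and key.startswith("OpenGL") and key.endswith("string"):
            fields[key] = value.strip()
    return fields


def is_software_rendered(fields):
    """True if the reported renderer is one of Mesa's software rasterisers."""
    renderer = fields.get("OpenGL renderer string", "").lower()
    return any(name in renderer for name in SOFTWARE_RENDERERS)


if __name__ == "__main__":
    if shutil.which("glxinfo") is None:
        print("glxinfo not found")
    else:
        out = subprocess.run(
            ["glxinfo", "-B"], capture_output=True, text=True, check=False
        ).stdout
        fields = parse_glxinfo(out)
        for key in sorted(fields):
            print(f"{key}: {fields[key]}")
        print("software rendered:", is_software_rendered(fields))
```

The check needs to be run from inside the same session (and ideally the same container) that fails, since a different session may be attached to a different display.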

wanderHZ commented 1 year ago

I solved this problem by adding an environment variable when building the container. The Dockerfile instruction is as follows:

ENV MESA_GL_VERSION_OVERRIDE=4.6

The related solution link is https://blog.csdn.net/hongbinlin_ben/article/details/121726309
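For anyone who wants to try the same override without rebuilding the image, here is a minimal sketch of two alternatives: setting the variable for a single child process, and building `-e NAME=value` arguments for `docker run`. The helper names are hypothetical, and nothing here is part of BenchBot. Note that, per the Mesa documentation, `MESA_GL_VERSION_OVERRIDE` only changes the OpenGL version that Mesa reports; it does not add hardware acceleration, so whether it helps will depend on the setup.

```python
import os
import shutil
import subprocess


def with_gl_override(version="4.6", base_env=None):
    """Return a copy of the environment with MESA_GL_VERSION_OVERRIDE set."""
    env = dict(os.environ if base_env is None else base_env)
    env["MESA_GL_VERSION_OVERRIDE"] = version
    return env


def docker_env_flags(env_vars):
    """Build `docker run` arguments (`-e NAME=value`) for the given variables."""
    flags = []
    for name, value in sorted(env_vars.items()):
        flags += ["-e", f"{name}={value}"]
    return flags


if __name__ == "__main__":
    # Show the flags that would be passed to `docker run`.
    print(docker_env_flags({"MESA_GL_VERSION_OVERRIDE": "4.6"}))
    # If glxinfo is available, show what Mesa reports with the override set.
    if shutil.which("glxinfo") is not None:
        subprocess.run(["glxinfo", "-B"], env=with_gl_override(), check=False)
```

Setting the variable in the Dockerfile, as above, has the advantage that every process in the container inherits it without extra flags.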

Thank you for responding to my question.

david2611 commented 1 year ago

Awesome work! Glad to hear you sorted out the problem, both for yourself and for anyone who wants to use such hardware in future :+1: Closing the issue now.