qcr / benchbot

BenchBot is a tool for seamlessly testing & evaluating semantic scene understanding tools in both realistic 3D simulation & on real robots
BSD 3-Clause "New" or "Revised" License
110 stars 12 forks source link

ERROR: An exception occurred while executing "benchbot_run" #81

Closed ffgao closed 1 year ago

ffgao commented 2 years ago

Hi,When I installed the program and executed "benchbot_run" command. The following error occurred.

~$ benchbot_run --robot carter_omni --env miniroom:1 --task semantic_slam:passive:ground_truth

################################################################################
################# CHECKING FOR BENCHBOT SOFTWARE STACK UPDATES #################
################################################################################

Checking BenchBot version ...               Up-to-date.
Checking BenchBot API version ...           Up-to-date.
Checking BenchBot Eval version ...          Up-to-date.
Checking BenchBot Supervisor version ...        Up-to-date.
Checking installed BenchBot add-ons are up-to-date ...  Up-to-date.

BenchBot is up-to-date.

################################################################################
###################### CLEANING UP ALL BENCHBOT REMNANTS #######################
################################################################################

Sending stop request to running controller:
{"stop_success":true}

Deleted the following containers:
Total reclaimed space: 0B

Finished cleaning! (use 'benchbot_run -k' for a full clean)

################################################################################
##################### STARTING THE BENCHBOT SOFTWARE STACK #####################
################################################################################

Running the BenchBot system with the following settings:

    Selected task:        semantic_slam:passive:ground_truth
    Task results format:  object_map
    Selected robot:       carter_omni
    Selected environment: miniroom:1
    Scene/s:              miniroom:1, starting @ pose [0.7, 0, 0, -0.7, 1.2, 1.5, 0.3]
                          (map_path = '.sim_data/miniroom_1.usd')
    Simulator required:   Yes (sim_omni)

Creating shared network 'benchbot_network':

Starting persistent container for ROS core:
Skipping (already running)

Starting persistent container for BenchBot Robot Controller (sim_omni):
Skipping (already running)

Starting container for BenchBot Supervisor:
7f1589564ca18af1020a65f1e8e92058aeb3f1d8713aa86eddefaa117818ded4

Starting container for BenchBot Debugging:
d1e6f2eb17d5aa82d63936cf39747199fedc66936cd17844f777473c0eafb263

################################################################################
################### BENCHBOT IS RUNNING (Ctrl^C to exit) ... ###################
################################################################################

Initialising supervisor...

Configuring the supervisor...
Starting a supervisor with the following configuration:

{'environments': [{'_file_path': '/benchbot/addons/benchbot_addons/benchbot-addons/envs_bear_develop_sim_omni/environments/miniroom_1.yaml',
                   'description': 'Mini room environment with all of the base '
                                  'objects in their normal place.\n'
                                  'All of the other mini room environments are '
                                  'based off this with any combination\n'
                                  'of objects, object positions, & lighting '
                                  'changed\n',
                   'map_path': '.sim_data/miniroom_1.usd',
                   'name': 'miniroom',
                   'object_labels': [...],
                   'robots': [...],
                   'start_pose': [...],
                   'trajectory_poses': [...],
                   'type': 'sim_omni',
                   'variant': 1}],
 'results': {'_file_path': '/benchbot/addons/benchbot_addons/benchbot-addons/formats_object_map/formats/object_map.yaml',
             'description': 'The "object map" is a map of the objects in an '
                            'environment. Each object is\n'
                            'represented by a probability distribution '
                            'describing the suggested object\n'
                            "label, and the bounding box's 3D centroid and "
                            'extent. An "object map" also\n'
                            'requires a "class_list", which is used to create '
                            'object label probability\n'
                            'distributions.\n',
             'functions': {'create': 'object_map.create_empty',
                           'create_object': 'object_map.create_empty_object',
                           'validate': 'object_map.validate'},
             'name': 'object_map'},
 'robot': {'_file_path': '/benchbot/addons/benchbot_addons/benchbot-addons/robots_sim_omni/robots/carter_omni.yaml',
           'address': 'http://benchbot_robot:10000',
           'connections': {'image_depth': {...},
                           'image_depth_info': {...},
                           'image_rgb': {...},
                           'image_rgb_info': {...},
                           'laser': {...},
                           'move_angle': {...},
                           'move_distance': {...},
                           'move_next': {...},
                           'poses': {...}},
           'global_frame': 'map',
           'name': 'carter_omni',
           'persistent_cmds': ['/benchbot/benchbot_simulator/run -P 10001 & '
                               'x=$! &&  /isaac-sim/start_nucleus.sh ; sleep 5 '
                               '&& curl -X POST http://localhost:10001/start '
                               '&& wait $x\n',
                               'rosrun benchbot_robot_controller noisify_odom '
                               '\\\n'
                               '  noise_linear:=0.2 noise_angular:=0.1\n'],
           'persistent_status': 'curl -s localhost:10001/started | grep -q '
                                "'true'\n",
           'poses': ['base_link', 'initial_pose', 'camera_left', 'lidar'],
           'robot_frame': 'base_link',
           'run_cmd': 'rostopic pub -1 /odom_start_pose std_msgs/String "data: '
                      '\'$START_POSE\'" && curl -s -o /dev/null '
                      'localhost:10001/open_environment \\\n'
                      '  -H "Content-Type: application/json" \\\n'
                      '  -d \'{"environment": "$ENVS_PATH/$MAP_PATH"}\' &&\n'
                      'curl -s -o /dev/null localhost:10001/place_robot \\\n'
                      '  -H "Content-Type: application/json" \\\n'
                      '  -d \'{"robot": "$ROBOT_PATH/.robot_data/carter.usd", '
                      '"start_pose": "$START_POSE"}\'\n',
           'stop_cmd': 'curl -s -o /dev/null -X POST '
                       'localhost:10001/stop_sim\n',
           'type': 'sim_omni'},
 'task': {'_file_path': '/benchbot/addons/benchbot_addons/benchbot-addons/tasks_ssu/tasks/sslam_pgt.yaml',
          'actions': ['move_next'],
          'description': 'Use a Semantic SLAM algorithm to construct an object '
                         'map of the environment. An object map describes each '
                         'object with a probabilistic label suggestion, '
                         'spatial location, and optional probabilistic state '
                         'change suggestion. This task provides passive robot '
                         'control, and sensor observations with ground truth '
                         'robot pose.\n',
          'localisation': 'ground_truth',
          'name': 'semantic_slam:passive:ground_truth',
          'observations': ['image_depth',
                           'image_depth_info',
                           'image_rgb',
                           'image_rgb_info',
                           'laser',
                           'poses'],
          'results_format': 'object_map',
          'scene_count': 1,
          'type': 'sim_unreal'}}

Supervisor is now available @ 'http://0.0.0.0:10000' ...

Waiting until a robot controller is found @ 'http://benchbot_robot:10000' ... 
    Found
Sending environment data & robot config to controller ... 
    Ready
Starting the robot controller ... 
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/requests/models.py", line 910, in json
    return complexjson.loads(self.text, **kwargs)
  File "/usr/lib/python3.6/json/__init__.py", line 354, in loads
    return _default_decoder.decode(s)
  File "/usr/lib/python3.6/json/decoder.py", line 339, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib/python3.6/json/decoder.py", line 357, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/usr/local/lib/python3.6/dist-packages/benchbot_supervisor/__main__.py", line 28, in <module>
    s.run()
  File "/usr/local/lib/python3.6/dist-packages/benchbot_supervisor/benchbot_supervisor.py", line 238, in run
    self._robot('/prepare')
  File "/usr/local/lib/python3.6/dist-packages/benchbot_supervisor/benchbot_supervisor.py", line 71, in _robot
    'json': data
  File "/usr/local/lib/python3.6/dist-packages/requests/models.py", line 917, in json
    raise RequestsJSONDecodeError(e.msg, e.doc, e.pos)
requests.exceptions.JSONDecodeError: [Errno Expecting value] <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
<title>500 Internal Server Error</title>
<h1>Internal Server Error</h1>
<p>The server encountered an internal error and was unable to complete your request. Either the server is overloaded or there is an error in the application.</p>
: 0

################################################################################
###################### CLEANING UP ALL BENCHBOT REMNANTS #######################
################################################################################

Sending stop request to running controller:
{"stop_success":true}

Stopped the following containers:
d1e6f2eb17d5

Deleted the following containers:
Deleted Containers:
d1e6f2eb17d5aa82d63936cf39747199fedc66936cd17844f777473c0eafb263
7f1589564ca18af1020a65f1e8e92058aeb3f1d8713aa86eddefaa117818ded4

Total reclaimed space: 189.2kB

Finished cleaning! (use 'benchbot_run -k' for a full clean)
btalb commented 2 years ago

Thanks for reporting @ffgao .

Have you got more context of what was going on when this happened:

Struggling to reproduce this locally, so any extra information you may have is helpful.

ffgao commented 2 years ago

Thank you for your help @btalb More information is below:

Below is a running record: Peek 2022-08-12 11-18

Error occurred a few minutes later: image

btalb commented 2 years ago

Hmmm.... sorry about the delay @ffgao , I'm struggling to immediately see what could be causing this (and can't reproduce locally).

I might need a bit deeper in terms of logging:

You may have to do a benchbot_run -k to clean up after this.

ffgao commented 2 years ago

No benchbot_robot_controller in the list of container.@btalb The following is my operation.

ffgao:~$ docker ps
CONTAINER ID   IMAGE                         COMMAND                  CREATED          STATUS          PORTS                                                                                                                                                                                              NAMES
372aaf8feeb4   benchbot/backend:base         "/bin/bash"              18 minutes ago   Up 18 minutes                                                                                                                                                                                                      benchbot_debug
c555e44fb73c   benchbot/simulator:sim_omni   "/bin/bash -c 'sourc…"   19 minutes ago   Up 19 minutes   3007/tcp, 3009/tcp, 3080/tcp, 3085/tcp, 3333/tcp, 8011-8012/tcp, 8080/tcp, 8211/tcp, 8888/tcp, 8891/tcp, 8899/tcp, 47995-48012/tcp, 47995-48012/udp, 49000-49007/tcp, 49100/tcp, 49000-49007/udp   benchbot_robot
40f0281b23a5   benchbot/backend:base         "/bin/bash -c 'sourc…"   19 minutes ago   Up 19 minutes                                                                                                                                                                                                      benchbot_ros
ffgao:~$ docker logs benchbot_ro
benchbot_robot  benchbot_ros    
ffgao:~$ docker logs benchbot_robot 

Robot controller is now available @ 'http://0.0.0.0:10000' ...
172.20.0.102 - - [2022-08-17 09:06:33] "GET // HTTP/1.1" 200 152 0.000461
172.20.0.102 - - [2022-08-17 09:06:34] "POST //configure HTTP/1.1" 200 137 0.170677
Preparing the requested controller ... 
FAILED TO PREPARE INSTANCE AFTER 120s. STATUS CHECK COMMANDS WAS:
    curl -s localhost:10001/started | grep -q 'true'

DUMPING LOGS FOR ALL COMMANDS... 
COMMAND:
    /benchbot/benchbot_simulator/run -P 10001 & x=$! &&  /isaac-sim/start_nucleus.sh ; sleep 5 && curl -X POST http://localhost:10001/start && wait $x

OUTPUT:
Traceback (most recent call last):
  File "<string>", line 585, in __prepare
  File "<string>", line 436, in prepare
  File "<string>", line 207, in prepare
IOError: [Errno 20] Not a directory: '/tmp/benchbot_logs/0/r'

172.20.0.102 - - [2022-08-17 09:08:34] "GET //prepare HTTP/1.1" 500 426 120.086528
david2611 commented 1 year ago

Apologies that we have been so slow in fixing problems.

I have also come across this when trying to verify a fix for #82.

If you perform a benchbot_install -b develop operation and run again you should get more meaningful output from your benchbot_robot log. Would you be able to send through what you receive so I can confirm it is the same as what I have been getting?

david2611 commented 1 year ago

While current solution is not pretty I have put a hotfix in place that seems to solve this issue for me. A fresh benchbot_install on the main branch should sort this out.

Worse case scenario I have found with the hotfix is that the first benchbot_run seems to fail but all subsequent ones succeed.

I will work to fix this bug but will open a new issue regarding this.

Can you confirm that this fixes the issues you originally encountered?

david2611 commented 1 year ago

Local testing seemed to show this issue as fixed. Closing due to inactivity