meta-control / mc_mros_reasoner

library for metacontrol-based self-adaptation using ontological reasoning, with wrappers for robotic systems based on ROS1 and ROS2
Apache License 2.0

The reasoner doesn't work with 'high' reasoning rates #165

Closed Rezenders closed 9 months ago

Rezenders commented 1 year ago

When I set the reasoning period to 1 or less, I get a lot of errors.

An example:

[pipeline_inspection_reasoner-1] [ERROR] [1673354688.878945832] [pipeline_inspection_reasoner]: Error raised in execute callback: list index out of range
[pipeline_inspection_reasoner-1] Traceback (most recent call last):
[pipeline_inspection_reasoner-1]   File "/opt/ros/humble/local/lib/python3.10/dist-packages/rclpy/action/server.py", line 324, in _execute_goal
[pipeline_inspection_reasoner-1]     execute_result = await await_or_execute(execute_callback, goal_handle)
[pipeline_inspection_reasoner-1]   File "/opt/ros/humble/local/lib/python3.10/dist-packages/rclpy/executors.py", line 107, in await_or_execute
[pipeline_inspection_reasoner-1]     return callback(*args)
[pipeline_inspection_reasoner-1]   File "/home/gus/pipeline_ws/build/mros2_reasoner/mros2_reasoner/ros_reasoner.py", line 156, in objective_action_callback
[pipeline_inspection_reasoner-1]     fg_instance = self.onto.search_one(solvesO=objective)
[pipeline_inspection_reasoner-1]   File "/home/gus/.local/lib/python3.10/site-packages/owlready2/namespace.py", line 395, in search_one
[pipeline_inspection_reasoner-1]     def search_one(self, **kargs): return self.search(**kargs).first()
[pipeline_inspection_reasoner-1]   File "/home/gus/.local/lib/python3.10/site-packages/owlready2/util.py", line 62, in first
[pipeline_inspection_reasoner-1]     if len(self) != 0: return self[0]
[pipeline_inspection_reasoner-1]   File "/home/gus/.local/lib/python3.10/site-packages/owlready2/util.py", line 192, in __getitem__
[pipeline_inspection_reasoner-1]     return self[i]
[pipeline_inspection_reasoner-1] IndexError: list index out of range
[pipeline_inspection_reasoner-1] [WARN] [1673354688.880966255] [pipeline_inspection_reasoner]: Goal state not set, assuming aborted. Goal ID: [ 77  12 182 243  76 147  77 121 161  64   2  40  60 160  33  37]

Or

[pipeline_inspection_reasoner-1] [ERROR] [1673533577.337098677] [pipeline_inspection_reasoner]: Error in perform_reasoning: 'NoneType' object is not subscriptable

I will add the other error messages here later.

Rezenders commented 1 year ago

[pipeline_inspection_reasoner-1] [ERROR] [1673533577.337098677] [pipeline_inspection_reasoner]: Error in perform_reasoning: 'NoneType' object is not subscriptable

This error also happens at lower rates, but less frequently. Unfortunately, I don't have any more information. My guess is that it has something to do with reading from and writing to the ontology.

Rezenders commented 1 year ago

[pipeline_inspection_reasoner]: In Analyze, exception returned: 'NoneType' object is not subscriptable

alexander-gabriel commented 1 year ago

This looks like a concurrency error. That should be easy enough to test.

If it is a concurrency error, there is some guidance:

https://owlready2.readthedocs.io/en/latest/sync.html?highlight=concurrent#synchronization says:

  1. Open the quadstore in non-exclusive mode (exclusive = False in set_backend()).
  2. Perform each modification to an ontology inside a “with ontology:” block. Owlready maintain a lock for each quadstore, which prevents multiple writes at the same time. Thus, for improving performances, you should also avoid long computation inside “with ontology:” blocks.
  3. Call World.save() at the end of each “with ontology:” block, in order to commit the changes to the quadstore database.

That means you'd have to adjust https://github.com/meta-control/mc_mros_reasoner/blob/50a67fb5bc73e01a963a3b77388b49d024d6feac/mros2_reasoner/mros2_reasoner/tomasys.py#L18

like so:

from owlready2 import World

# Open the quadstore in non-exclusive mode (point 1 above).
world = World()
world.set_backend(filename = "/path/to/your/file.sqlite3", exclusive = False)

and also adjust the rest of the code that accesses the backend according to points 2 and 3.

alexander-gabriel commented 1 year ago

That still would not let you run the reasoner any faster, though; it would just mean more waiting and fewer bugs.

Regardless of whether your current problem is caused by concurrency, in the context of ROS concurrency is always a concern, so this code should be concurrency-safe anyway.
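
As a rough illustration of what "concurrency safe" could look like, here is a minimal sketch that serializes every access to shared state behind a single lock. It uses only the Python standard library. A plain dict stands in for the ontology, and the names GuardedStore, run_demo and "reasoning_cycles" are hypothetical; none of them come from mc_mros_reasoner or Owlready2.

```python
import threading


class GuardedStore:
    """Hypothetical stand-in for the shared ontology: a dict behind one lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._data = {}

    def write(self, key, value):
        # Keep the critical section short: only the mutation runs under the lock.
        with self._lock:
            self._data[key] = value

    def read(self, key, default=None):
        # Reads take the same lock so they never observe a half-finished update.
        with self._lock:
            return self._data.get(key, default)

    def increment(self, key):
        # Read-modify-write has to be a single critical section,
        # otherwise concurrent updates can be lost.
        with self._lock:
            self._data[key] = self._data.get(key, 0) + 1


def run_demo(n_threads=4, n_updates=1000):
    """Let several threads update the store at once and return the final count."""
    store = GuardedStore()

    def worker():
        for _ in range(n_updates):
            store.increment("reasoning_cycles")

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return store.read("reasoning_cycles")
```

In the reasoner, the two contenders would presumably be the objective action callback and the periodic reasoning step, with each touching the ontology only inside such a guarded section. With rclpy, placing both callbacks in one MutuallyExclusiveCallbackGroup might be another way to get the same serialization, though I have not tried that here.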