equinor / flotilla

Flotilla is the main point of access for operators to interact with multiple robots in a facility.
Eclipse Public License 2.0

Robot stuck in busy after missions complete quickly #1387

Closed andchiind closed 3 months ago

andchiind commented 7 months ago

Describe the bug
When missions complete near-instantly in ISAR, the state in Flotilla becomes inconsistent. Flotilla shows the robot as busy and therefore does not start any new missions after the last one completes, even though the robot is Available and the mission status is Idle in ISAR. This is particularly problematic for ISAR instances that ignore localization.

The main issue seems to be that we set the robot to busy manually, even though the robot has already reported that it is available. We set it to busy as soon as we try to start a mission, to prevent more than one mission from being scheduled, but sometimes this happens after ISAR has returned its response to the mission start request, at which point the mission is already done. We then wait for a message from ISAR saying that the robot is available, even though that event has already been received.
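The ordering problem described above can be reproduced in a few lines. This is only a minimal sketch, not Flotilla's actual code: `FlotillaRobotView`, `on_isar_status`, `mark_busy_on_start` and `run_fast_mission` are hypothetical names made up for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum


class RobotStatus(Enum):
    AVAILABLE = "Available"
    BUSY = "Busy"


@dataclass
class FlotillaRobotView:
    """Hypothetical stand-in for the robot state kept on the Flotilla side."""

    status: RobotStatus = RobotStatus.AVAILABLE
    log: list = field(default_factory=list)

    def on_isar_status(self, status):
        # Event handler: store whatever ISAR last reported.
        self.status = status
        self.log.append(f"isar:{status.value}")

    def mark_busy_on_start(self):
        # Manual override applied when a mission start is requested.
        self.status = RobotStatus.BUSY
        self.log.append("manual:Busy")


def run_fast_mission(view):
    # The mission finishes almost instantly, so the "Available" event
    # from ISAR is handled first ...
    view.on_isar_status(RobotStatus.AVAILABLE)
    # ... and only afterwards is the manual override applied.
    view.mark_busy_on_start()
    # No further "Available" event will arrive, so this stays Busy.
    return view.status
```

With this ordering the view ends up Busy, and nothing in the event stream will ever reset it, which matches the stuck state reported here.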

To Reproduce
Remove the sleeps in initiate_mission and initiate_step in isar_robot, then run any mission with a single task (such as localization, but using not-stepwise might also work for missions with more tasks).

Expected behavior
The heartbeat messages from ISAR should either be able to correct the state, or the other event handlers in Flotilla need to be less reliant on events arriving in a specific order from ISAR.
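One way the first option could look is sketched below. It assumes, hypothetically, that each heartbeat carries the robot status and mission status as ISAR currently sees them; whether the real heartbeat payload contains these fields is not confirmed in this thread. `RobotRecord` and `reconcile_with_heartbeat` are illustrative names, not existing Flotilla code.

```python
from dataclasses import dataclass


@dataclass
class RobotRecord:
    """Hypothetical local copy of a robot's state (not Flotilla's model)."""

    status: str = "Available"
    mission_in_flight: bool = False


def reconcile_with_heartbeat(record, reported_status, mission_status):
    """Correct the local record if a heartbeat contradicts it.

    Returns True when a correction was made, False otherwise.
    """
    robot_is_free = reported_status == "Available" and mission_status == "Idle"
    if robot_is_free and record.status == "Busy":
        # The local state is stale: the "Available" event was missed
        # or overwritten by the manual busy flag.
        record.status = "Available"
        record.mission_in_flight = False
        return True
    return False
```

A real implementation would probably need a grace period after a mission start request, since a heartbeat sent before ISAR has picked up the mission would otherwise clear the busy flag too early.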


UsamaEquinorAFK commented 7 months ago

Unassigning myself because of a pending discussion.

oysand commented 6 months ago

When STEP_DURATION_IN_SECONDS in isar-robot was set below ROBOT_STATUS_PUBLISH_INTERVAL in ISAR, the robot status was not sent to Flotilla. When it was increased above that value, the status was sent to Flotilla.

oysand commented 6 months ago

This sleep makes it so that if the entire step completes within this time, nothing is sent, since the combined status is the same as the previous robot status.

Image
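The effect described in the two comments above can be illustrated with a small simulation. This is a sketch under assumptions, not ISAR's publisher: it assumes the status is sampled once per publish interval and published only when it differs from the previously published value. `status_at`, `published_statuses` and the example timelines are hypothetical, with times in milliseconds.

```python
def status_at(timeline, t_ms):
    """Return the most recent status whose change time is <= t_ms."""
    current = None
    for change_ms, status in timeline:
        if change_ms <= t_ms:
            current = status
    return current


def published_statuses(timeline, publish_interval_ms, end_ms):
    """Sample the status every interval and publish only on change."""
    published = []
    previous = None
    for t_ms in range(0, end_ms + 1, publish_interval_ms):
        current = status_at(timeline, t_ms)
        if current != previous:
            published.append((t_ms, current))
            previous = current
    return published


# Step shorter than the publish interval: busy from 100 ms to 300 ms.
fast_step = [(0, "available"), (100, "busy"), (300, "available")]
# Step longer than the publish interval: busy from 100 ms to 1600 ms.
slow_step = [(0, "available"), (100, "busy"), (1600, "available")]
```

With a 1000 ms interval, `published_statuses(fast_step, 1000, 3000)` yields only the initial available status, because the busy period falls entirely between two samples. For `slow_step` the busy status and the later available status are both published.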

andchiind commented 3 months ago

Closed until we re-encounter the issue. A suspected solution has been implemented.