Problem

Try to imagine a scenario in the future that several unmanned vehicles are scheduled to pick up and transport passengers. The map of the city is stored in each vehicles. The problem is how to plan the global optimal (or near optimal) path for each vehicles so that all the passengers do not need to wait too long. Also, we don't want the vehicles to waste their energy on unnecessary wandering.

We define several steps that can help us to solve this problem gradually.

Step 1

In the first step, we will get rid of time and there is only one agent. All passengers are waiting on their positions at the beginning. They don't care about when they will be picked up, in other words, they will wait forever. Obviously, it is a classic planing problem. The optimal path can be planned by A* under these conditions.

We have n passengers P1 . . .Pn. For each i ∈ [1,n], there exist a start position Si and a goal position Gi. A solution to this problem is a plan π which is a sequences of actions such that π allow agent A to carry Pi from Si to Gi for each i ∈ [1,n]. The agent can perform one of seven actions each time step. The agent can moves to one of the four sides of his current location on the grid or the agent waits and stays at the same location or the agent pick up or drop passengers.

Step 2

In this step, we will put several agents in the scenario, so we will consider the real multi-agent problem. We will try to use different algorithms to make best schedule, which can minimise each passenger's waiting time, and shortest path, which save most fuel. Meanwhile, the agents need to avoid traffic accident.

We have n passengers P1 ... Pn. For each i ∈ [1,n], there exist a start position Si and a goal position Gi. A plan π consisting of n sequences π1 , . . . πm of move/wait actions such that π allows agent A1 . . . Am to carry Pi from Si to Gi for each i ∈ [1,n]. During each time step, each agent can perform one of seven actions listed above.

Step 3

Finally, we take time into consideration. We will try to enable the passengers to make their orders at any time they want. Thus, our agents need to replan their schedules at any possible time slots. Also, we introduce the concept of time window - if a passenger wait too long, he/she may lose patience then cancel the order. The agent's goal is to make every passengers happy if possible. If not, minimise the lose of passengers, meanwhile, minimise the consumption of energy.

We have n passengers P1 . . . Pn. For each i ∈ [1,n], there exist a start position Si, a goal position Gi, a waiting start time step TSi and waiting dead line time step TDi. A plan π consisting of n sequences π1 , . . . πm of actions such that π allows agent A1 . . . Am to pick up Pi between time step TSi and TDi and carry Pi from Si to Gi for each i ∈ [1,n]. During each time step, each agent can perform one of seven actions listed above.

Plan validation

To decide whether a plan is valid/optimal, we need to consider:

K-robust The plan must be K-conflict free. K-conflict occurs iff there exists ∆ ∈ [0,k] such that two or more agents are located in the same location in time steps t and t + ∆.
The loss of passengers As aforementioned, the passengers will cancel their orders when they lose patiences. It is the most important metric in step 3. The plan should minimise the loss of passengers then other metics matter.
Total waiting time of passengers We don't want our passengers to wait too long, even though the waiting time is within their tolerance.
Total path length This metric reflects the energy consumption by all unmanned vehicles.
Others E.g. The variance of waiting time in different passengers.

Map the problem into Minecraft

We create a map representing a city road network, which will be stored in the agents' memory as grid.
Passengers can be represented by blocks located at any positions in the map. They can be picked up by the agents then dropped when they arrive the destination.
Players (agents) in the Minecraft play the role of unmanned vehicles in the real scenario, they perform one of the seven possible actions at each time step.

Metrics

Completeness The method must guarantee to find a solution when there is one
Optimality The method must return the optimal method if it exists.
Time complexity It is measured in the number of states(nodes) that being expanded during problem solving. Obviously, the less nodes we need to expand, the less time we need to get the result.
Space complexity The memory the method need to use to solve the problem.
Scalability Including scalability of agents/passengers/size of map.

Hypothesis

It is a combination of Multi-agent Path Finding (MAPF) problem and scheduling/transportation problem. We can create smart AI to plan the global optimal (or near optimal) path for each agents.

Questions

Can we adjust the algorithms that perform well in classic MAPF problem to the problem we define?
Which algorithm perform the best in this problem?
Can we improve the performance of algorithms in any possible way? Or can we introduce any new algorithms to solve this problem?

Methods in previous works

We have found some previous works which focus on MAPF problem. We will adjust them to the problem we define and compare the performance among all implementation and get knowledge.

A based solution The A family of algorithms are natural solvers for MAPF. It is the benchmark for other algorithms.
Standley’s enhancements Three methods that substantially improve the basic A* setting were introduced by (Standley 2010).

Independence detection (ID) The basic idea of ID is to divide the agents into independent groups. Two groups of agents are independent if there is an optimal solution for each group such that the two solutions do not conflict. Conflict voidance table (CAT) The CAT stores the location and time of every agent in every group. Then, when an MAPF solver is applied for a given group, ties between nodes with the same f-value are broken in favor of the node that has fewer entries in the CAT. Operator decomposition (OD) While the former two ideas are general and can be used by any MAPF solver, OD is specific for an A*-based solver. OD reduces the branching factor by introducing intermediate states between the regular states. Intermediate states are generated by applying an operator to a single agent only.

Conflict based search solution CBS is a two-level algorithm. At the high level, a search is performed on a tree based on conflicts between agents. At the low level, a search is performed only for a single agent at a time. In many cases this reformulation enables CBS to examine fewer states than A* while still maintaining optimality.

bahriddin / DeepMine

Muti-Agent Scheduling and Path Finding #1