Run time of task assignment varies significantly

kfu0115 commented 6 years ago

I am doing a performance analysis on branch and bound. And for the same scenario, it varies. The more tasks I have, the wider the range. Take the case of 2 vehicles with 72 line search tasks for example, the assignment time can be 40.394 or 80.916 seconds. Is there a good explanation for that? I see something the difference being caused by messaging. And it seems for the branch and bound algorithm, it seems in difference simulations, different number of nodes are created. Is that another reason?

Can you also give me some reference of how branch and bound works here? The branch and bound algorithm for task assignment I know is N to N allocation, meaning number of tasks and number of assignee have to be the same to form a N by N cost matrix. How does BnB work in UxAS when you have like 2 vehicles and 8 line tasks. Is the number of options being the number of initial node? Some reference for the Branch and Bound implementation in UxAS would be a great help.

derekkingston commented 6 years ago

The task assignment proceedure is deterministic, so for the same inputs/states and the same tasks, the computational steps should be the same. So, there should be exactly the same set of messages exchanged, the same number of nodes explored, etc. I can't think of any reason for a 2x increase in assignment time given the same situation. Is this noticed in subsequent full simulations with AMASE? Rather than run AMASE, you can configure the SendMessagesService to inject just the messages you need and guarantee that the same states and tasks are used in the computation.

However, if you have different initial conditions (aircraft positions), the tree will be searched in a different direction since it always searches in a greedy manner and a change in initial aircraft position will change which task is closest and therefore assigned first and consequently how the rest of the tree is built.

An explanation of/reference to the branch and bound tree search can be found at the OpenUxAS release page. You may also find the paper there on Process Algebra useful. Note, branch and bound is a technique for pruning the tree to ensure that the parts of the search space that can't possibly improve the current best solution will not be explored. Otherwise, it's a traditional depth-first search.

To step through the assignment proceedure, consider the following labeling scheme: let V_X -> {T_Y}_Z indicate that vehicle X is assigned to task Y using option Z. For example V_2 -> {T_3}_1 means that vehicle 2 is assigned to perform task 3 with option 1.

Starting with 0 tasks assigned, consider the time for each vehicle to go from its initial position to every task/option pair and choose the minimum time, i.e. the first assignment to be placed on the tree is the solution to min_{n=1..N} min_{t=1..T} min{o=1..O} J(V_n -> {T_t}_o) where J(.) is the cost for that assignment (the time to complete). Assuming 2 vehicles and 8 line search tasks each with 2 options, this would be min_{n=1..2} min_{t=1..8} min{o=1..2} J(V_n -> {T_t}_o). Essentially, this means choose the shortest route from the inital vehicle positions to any of the the task options. Once a task has been assigned, it is removed from consideration for all child nodes deeper in the tree. The bottom of the tree, a leaf node, will be when the final task is assigned and there are no more to consider.

For illustration purposes consider a simple example with 2 vehicles and 3 tasks (single options). An example tree is as follows:

                                        NULL
                                         |
                                         |
    ----------------------------------------------------------------------------
    |              |              |              |              |              |
V_1 -> T_1     V_1 -> T_2     V_1 -> T_3*    V_2 -> T_1     V_2 -> T_2     V_2 -> T_3
                                  |
                                  |
                   ----------------------------------------------
                   |              |              |              |
               V_1 -> T_1     V_1 -> T_2     V_2 -> T_1*    V_2 -> T_2
                                                 |
                                                 |
                                  ----------------
                                  |              |
                              V_1 -> T_2     V_2 -> T_2*

Which can be interpreted as: vehicle 1 is assigned to task 3 while vehicle 2 is assigned to complete task 1 and then task 2.

Once the bottom of the tree is reached, a feasible assignment is in hand and an upper bound on the total cost of the assignment can be calculated. This cost is then used as the search continues in a depth-first manner ignoring any nodes that have a cost greater than the current best solution cost. If a new leaf node is discovered at cheaper cost, the "best solution cost" is updated and the search continues. Once the tree is completely searched or the maximum number of nodes has been visited, the best solution found is returned.

OpenUxAS allows the maximum number of nodes to be searched to be enforced, however, it must first find a feasible solution (reach the bottom of the tree). For large problems, a greedy search should still be very fast, but the route planning that informs the search of the cost-to-go for each vehicle-task-option could explode. For 2 vehicles and 72 line search tasks, the route planner must consider (72*2*2)*(72*2*2) + (72*2*2) = 83,232 routes which even at a 1ms route planning time would take 80s just to build the inputs to the tree search.

kfu0115 commented 6 years ago

Use the tree in the example, I have a few questions.

So in the top layer, you have n=2* total options node. And that's the number of function calls of Expand_node() right?
And then, you search the node with the smallest cost, and expand/branch from that node, right? Then, you need to call Expand_node() 4 times again in that second layer? (I wrote a profiler to trace function calls and see how expand_node being called). If that is the case, why? since you have cost of each node calculated already in the first layer. but here again, you find the node with the minimum cost and branch again from that node, right.
You repeat the process until you reach the bottom of tree then you get a feasible assignment. Isn't that supposed to be the end of the algorithm? You said the assignment you got now is used to calculate the upper bound and used for a new search. Don't you have a lowest cost already after you reach the bottom of the tree? how do you conduct the search going after you've reached the bottom of the tree? go to the top layer and find another node that has lower cost than upper bound and repeat the process? do you use a new cost function instead of using shortest time?

derekkingston commented 6 years ago

Correct. At the top of the tree, each vehicle (in this case 2) is considered for assignment to each task (in this case 3). If tasks each had options, then those options would also have to be included. For simplicity in this example, each task has 1 option, so the total number of considerations is: 2 x 3 x 1 = 6 (2 vehicles x 3 tasks x 1 option/task).
Correct. The decision at the first layer is the smallest cost among those 6 choices. This selects the node to expand at the next layer. Since task 3 is already assigned at the top layer, it is not included in consideration at this next layer. So, now each vehicle (2 vehicles) should be considered for completing the remaining tasks (2 remaining tasks). Again, only single options per task, so the total number of considerations at this layer is: 2 x 2 x 1 = 4 (2 vehicles x 2 remaining tasks x 1 option/task).

A subtle point to keep in mind: the top layer considers the cost-to-go from the initial positions of the vehicles. Once a vehicle has been assigned at a layer, it needs to be considered at the subsequent layer as if it had just finished the assigned task. Task/option pairs indicate the start position, the cost of completing the task, and the resulting end position, so the cost for considering a vehicle assigned at an earlier layer is the cumulative cost over the entire trace through the tree.

For our simple example, consider the following cost matrix:

	T_1	T_2	T_3
V_1 (from initial)	8	4	1
V_2 (from initial)	2	8	6
V_1 (from T_1)	n/a	6	2
V_2 (from T_1)	n/a	5	1
V_1 (from T_2)	2	n/a	5
V_2 (from T_2)	3	n/a	8
V_1 (from T_3)	6	8	n/a
V_2 (from T_3)	5	8	n/a

Then the total running cost of the assignment is: J({V_1 -> T_3}) = 1 J({V_1 -> T_3}) + J({V_2 -> T_1}) = 1 + 2 = 3 J({V_1 -> T_3}) + J({V_2 -> T_1}) + J({V_2: T_1 ->T_2}) = 1 + 2 + 5 = 8

So, even though some of the costs remain the same from a previous layer, any cost associated with the assigned vehicle will change. One could keep all those costs from a previous layer that are still applicable at the next layer, replacing only those that must be updated. Since we look these costs up from a table and the process algebra relationship may prune or open up new tasks, it's simpler to simply re-look up costs from the table at node expansion.

This process is repeated until the bottom of the tree is reached, being careful to note that the total cost is the sum of costs at each stage. Once you reach the bottom of the tree, you have a feasible assignment: all tasks have been assigned. However, because each layer considered only the immediate one-step cost there may be another solution that has a high cost step that then opens up low cost areas lower in the tree. The feasible solution is an upper bound - there may be other solutions that have lower total cost, but since you have a feasible solution in hand, you can be sure that the problem can be solved with at most this cost.

In the example cost matrix, you can see that the actual optimal solution is: {V_2 -> T_1}, V_2: {T_1 -> T_3}, {V1 -> T_2} which has cost 2 + 1 + 4 = 7.

Once the bottom of the tree is reached, the search proceeds "backwards" in the sense that it goes back up layer by layer and reconsiders all nodes that it hasn't expanded only if the running cost of expanding that node doesn't exceed the current best upper bound cost.

In many cases, the solution obtained by just stopping when first reaching the bottom (i.e. the greedy solution) is a close approximation to the overall optimal. However, there is no guarantee - the greedy solution can be arbitrarily bad in the theoretical case. Generally, it provides a good starting point from which to start a larger search or more complex search.

kfu0115BAE commented 6 years ago

So, when you say the search proceeds "backwards". In this example, it explores all other nodes in the last layer ( in this case, only V_1->T_2) and picks the one with the smallest cost based on the cost matrix you already have, right? So it will be (V_1->T2) from the buttom layer. Since T2 is picked, then pick between V2_T1 and V1_T1 (from T2? or from T3) in the second layer. V2_T1 has cost 2. V1_T1 also has cost 2. But V2_T1 has been explored already when we found the feasible assignment, we go V1_T1? it doesn't look like the optimal solution you gave. So, after we start from V1_T2, we branch again with 4 nodes that contains T1 and T3 right? If we have more than 2 vehicles, after we reach the bottom of the second time (if another lower cost path is found), we explore the unvisited node of that new bottom layer and repeat the process again, right? Thanks!!!

derekkingston commented 6 years ago

By backwards, I mean from the bottom to the top. As you note, this would mean considering V_1 -> T_2 in place of V_2 -> T_2 assuming the rest of the tree above it still holds. The bottom layer is a special case since you've already exhaustively considered all possibilities with that state of the tree. In this example, the next interesting assignment to reconsider is V_2 -> T_1.

Here's the sequence of building and searching the tree:

assignment	running cost	cost bound
NULL	0	n/a
{V_1 -> T_3}	1	n/a
{V_1 -> T_3}, {V_2 -> T_1}	3	n/a	n/a
{V_1 -> T_3}, {V_2 -> T_1}, V_2: {T_1 -> T_2}	8	8

{V_1 -> T_3}, V_1: {T_3 -> T_1}	6	8
{V_1 -> T_3}, V_1: {T_3 -> T_1}, V_1: {T_1 -> T_2}	12	8
{V_1 -> T_3}, V_1: {T_3 -> T_2}	9	8
{V_1 -> T_3}, {V_2 -> T_2}	9	8

{V_2 -> T_1}	2	8
{V_2 -> T_1}, V_2: {T_1 -> T_3}	3	8
{V_2 -> T_1}, V_2: {T_1 -> T_3}, {V_1 -> T_2}	7	7

{V_2 -> T_1}, V_2: {T_1 -> T_2}	7	7

{V_1 -> T_2}	4	7
...	...	...

Does this make sense?

kfu0115BAE commented 6 years ago

This sequence explains everything: You get your upper bound 8 first. Then you go back to V1_T3 to see if there is any alternative that gives a lower cost. Once you reach a point having higher cost than the current bound, the search stops. Once all other alternatives of V1_T3 are exhausted, you pick the second lowest initial node V2_T1, and eventally get a new bound 7. Then keep this procedure going.

How can I define the maximum number of node searching? I ran into cases (diffenet arrangements of 96 lines, but 97 lines are OK) that just keep expanding nodes for hours that I ended up terminating the process. I am not sure if the first upper bound had been found for these cases. But it would be nice if we can limit the total time and return the best soltion found by then. Thanks so much.

kfu0115 commented 6 years ago

128linearray

It seems not all tasks are assigned. In this case, I have an 8 by 16 array of line tasks. I pick 127 of them to task assignment. As a result, 38 tasks were not assigned. Does it mean the search stopped sometime and it returns whatever tasks already assigned? if that's the case, isn't it better to return the result of the first bound or whatever new bound that is reached so we have all tasks assigned?

derekkingston commented 6 years ago

The branch and bound service should default to returning the greedy solution only - that is, unless you've configured it differently, it will return immediately the first time it reaches the bottom of the tree. To change the number of nodes it will search before returning, use the NumberNodesMaximum paramter (-1 = full search, 0 = greedy only [default], >0 = search that many nodes and then return best solution).

So, to configure the search to return an answer after 10000 nodes searched, configure the service as:

<Service Type="AssignmentTreeBranchBoundService" NumberNodesMaximum="10000" CostFunction="MINMAX" />

derekkingston commented 6 years ago

In your test case, you have 2 vehicles, 127 tasks and each task has 2 options. So the total number of routes that the route planner will attempt to calculate is: (2*127*2)*(2*172*2 + 1) = 258,572. However, the LMCP message for the cost matrix has a max list size of 2^16 -1 = 65,535. So you are overflowing the cost matrix message and only getting some of the options considered.

UxAS needs to be updated to report this error, but in the mean time, you can increase a list size in the LMCP messages by adding the LargeArray="true" parameter, see large_array_mdms.zip

Although you can modify the messaging system to work around the overflow, you do need to consider whether this is the right technique for very large numbers of tasks (or vehicles for that matter). With 200k+ calls to the route planner, just generating the cost matrix will be very time consuming. UxAS was intended for a single operator controlling a handful of autonomous vehicles with a limited set of tasks to be completed.

afrl-rq / OpenUxAS-SoI

Run time of task assignment varies significantly #64