Learning-and-Intelligent-Systems / predicators

Learning for effective and efficient bilevel planning
MIT License
88 stars 15 forks source link

implement parallelized version of hill climbing #264

Closed tomsilver closed 2 years ago

tomsilver commented 2 years ago

we can evaluate all successors in parallel, which will help when the heuristic evaluation is the bottleneck (which it is for predicate learning)

ronuchit commented 2 years ago

i can look into this but i think we might find that the overhead is too great

tomsilver commented 2 years ago

@ronuchit I actually looked into it a bit already and found some modest speedups, not as much as you'd hope, but like 2-3x I think when parallelizing over 12 CPUs, and when the score function is significantly expensive. Here is the code snippet I was using if you do look into it:

import pathos.multiprocessing as mp  # note that you need to use pathos for the same reason we need to use dill

def run_hill_climbing(initial_state: _S,
                      check_goal: Callable[[_S], bool],
                      get_successors: Callable[[_S], Iterator[Tuple[_A, _S,
                                                                    float]]],
                      heuristic: Callable[[_S], float],
                      enforced_depth: int = 0) -> Tuple[List[_S], List[_A]]:
    """Enforced hill climbing local search.

    For each node, the best child node is always selected, if that child is
    an improvement over the node. If no children improve on the node, look
    at the children's children, etc., up to enforced_depth, where enforced_depth
    0 corresponds to simple hill climbing. Terminate when no improvement can
    be found.

    Lower heuristic is better.
    """
    assert enforced_depth >= 0
    cur_node: _HeuristicSearchNode[_S, _A] = _HeuristicSearchNode(
        initial_state, 0, 0)
    last_heuristic = heuristic(cur_node.state)
    visited = {initial_state}
    print(f"\n\nStarting hill climbing at state {cur_node.state} "
          f"with heuristic {last_heuristic}")
    while True:
        if check_goal(cur_node.state):
            print("\nTerminating hill climbing, achieved goal")
            break
        best_heuristic = float("inf")
        best_child_node = None
        current_depth_nodes = [cur_node]
        for depth in range(0, enforced_depth + 1):
            print(f"Searching for an improvement at depth {depth}")
            # This is a list to ensure determinism. Note that duplicates are
            # filtered out in the `child_state in visited` check.
            successors_at_depth = []
            for parent in current_depth_nodes:
                child_nodes = []
                for action, child_state, cost in get_successors(parent.state):
                    if child_state in visited:
                        continue
                    visited.add(child_state)
                    child_path_cost = parent.cumulative_cost + cost
                    child_node = _HeuristicSearchNode(
                        state=child_state,
                        edge_cost=cost,
                        cumulative_cost=child_path_cost,
                        parent=parent,
                        action=action)
                    successors_at_depth.append(child_node)

            if not successors_at_depth:
                print("\nTerminating hill climbing, no more successors")
                break

            num_cpus = mp.cpu_count()
            fn = lambda n: (heuristic(n.state), n)

            with mp.Pool(processes = num_cpus) as p:
                for child_heuristic, child_node in p.imap_unordered(fn,
                    successors_at_depth):
                    if child_heuristic < best_heuristic:
                        best_heuristic = child_heuristic
                        best_child_node = child_node

            # Some improvement found.
            if last_heuristic > best_heuristic:
                print(f"Found an improvement at depth {depth}")
                break
            # Continue on to the next depth.
            current_depth_nodes = successors_at_depth
            print(f"No improvement found at depth {depth}")
        if last_heuristic <= best_heuristic:
            print("\nTerminating hill climbing, could not improve score")
            break
        cur_node = best_child_node
        last_heuristic = best_heuristic
        print(f"\nHill climbing reached new state {cur_node.state} "
              f"with heuristic {last_heuristic}")
    return _finish_plan(cur_node)
ronuchit commented 2 years ago

This looks good at a glance. Why didn't you want to make it into a PR? Couple of questions though: 1 - will it eat up the entire processor on our Macs? 2 - did you verify the scores are the same as in sequential mode?

tomsilver commented 2 years ago

I just didn't make a PR because I didn't have time yet to test it (so no I didn't do 2 yet). Re: 1, my computer didn't seem to slow down, but we should see.

ronuchit commented 2 years ago

i can take over on it later