Closed tomsilver closed 2 years ago
i can look into this but i think we might find that the overhead is too great
@ronuchit I actually looked into it a bit already and found some modest speedups, not as much as you'd hope, but like 2-3x I think when parallelizing over 12 CPUs, and when the score function is significantly expensive. Here is the code snippet I was using if you do look into it:
import pathos.multiprocessing as mp # note that you need to use pathos for the same reason we need to use dill
def run_hill_climbing(initial_state: _S,
check_goal: Callable[[_S], bool],
get_successors: Callable[[_S], Iterator[Tuple[_A, _S,
float]]],
heuristic: Callable[[_S], float],
enforced_depth: int = 0) -> Tuple[List[_S], List[_A]]:
"""Enforced hill climbing local search.
For each node, the best child node is always selected, if that child is
an improvement over the node. If no children improve on the node, look
at the children's children, etc., up to enforced_depth, where enforced_depth
0 corresponds to simple hill climbing. Terminate when no improvement can
be found.
Lower heuristic is better.
"""
assert enforced_depth >= 0
cur_node: _HeuristicSearchNode[_S, _A] = _HeuristicSearchNode(
initial_state, 0, 0)
last_heuristic = heuristic(cur_node.state)
visited = {initial_state}
print(f"\n\nStarting hill climbing at state {cur_node.state} "
f"with heuristic {last_heuristic}")
while True:
if check_goal(cur_node.state):
print("\nTerminating hill climbing, achieved goal")
break
best_heuristic = float("inf")
best_child_node = None
current_depth_nodes = [cur_node]
for depth in range(0, enforced_depth + 1):
print(f"Searching for an improvement at depth {depth}")
# This is a list to ensure determinism. Note that duplicates are
# filtered out in the `child_state in visited` check.
successors_at_depth = []
for parent in current_depth_nodes:
child_nodes = []
for action, child_state, cost in get_successors(parent.state):
if child_state in visited:
continue
visited.add(child_state)
child_path_cost = parent.cumulative_cost + cost
child_node = _HeuristicSearchNode(
state=child_state,
edge_cost=cost,
cumulative_cost=child_path_cost,
parent=parent,
action=action)
successors_at_depth.append(child_node)
if not successors_at_depth:
print("\nTerminating hill climbing, no more successors")
break
num_cpus = mp.cpu_count()
fn = lambda n: (heuristic(n.state), n)
with mp.Pool(processes = num_cpus) as p:
for child_heuristic, child_node in p.imap_unordered(fn,
successors_at_depth):
if child_heuristic < best_heuristic:
best_heuristic = child_heuristic
best_child_node = child_node
# Some improvement found.
if last_heuristic > best_heuristic:
print(f"Found an improvement at depth {depth}")
break
# Continue on to the next depth.
current_depth_nodes = successors_at_depth
print(f"No improvement found at depth {depth}")
if last_heuristic <= best_heuristic:
print("\nTerminating hill climbing, could not improve score")
break
cur_node = best_child_node
last_heuristic = best_heuristic
print(f"\nHill climbing reached new state {cur_node.state} "
f"with heuristic {last_heuristic}")
return _finish_plan(cur_node)
This looks good at a glance. Why didn't you want to make it into a PR? Couple of questions though: 1 - will it eat up the entire processor on our Macs? 2 - did you verify the scores are the same as in sequential mode?
I just didn't make a PR because I didn't have time yet to test it (so no I didn't do 2 yet). Re: 1, my computer didn't seem to slow down, but we should see.
i can take over on it later
we can evaluate all successors in parallel, which will help when the heuristic evaluation is the bottleneck (which it is for predicate learning)