AutodeskAILab / Fusion360GalleryDataset

Data, tools, and documentation of the Fusion 360 Gallery Dataset

structuring /tools/search/ 1) random roll-out #22

Closed: evanthebouncy closed 3 years ago

evanthebouncy commented 4 years ago

2020-08-27

we'll organise the search structure to admit easy experiments and plot generation

base_search.py is perfect the way it is, but I would rename BaseSearch to ReplEnv, have it handle interacting with the fusion server, and move the logging somewhere else. For now leaving it as is is fine.
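
for concreteness, here is a rough sketch of what ReplEnv could look like (the method names are guesses, inferred from how the env is used in the search code further down):

class ReplEnv:
    # owns all communication with the running Fusion 360 server;
    # logging should live elsewhere (e.g. a separate Log class)
    def __init__(self):
        pass # connect to the fusion server here

    def setup(self, target_file):
        # load the target design, return its face-adjacency graph
        pass

    def extrude(self, action):
        # apply one (start_face, end_face, operation) extrude action,
        # return the updated graph and the current iou against the target
        pass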

we'll be modifying random_search.py, factoring it to allow easy modifications and additions of other agents and search procedures

the general idea of performing search is that, while an agent is unlikely to produce the correct reconstruction in one go, repeated usage of the agent in clever ways will increase the likelihood of reconstruction.

there are different kinds of agents (e.g. a random agent, or a supervised agent trained on the dataset), and there are different kinds of search procedures (e.g. random rollout).

as we can see, the agents and the search procedures are factorized and decoupled, so that any agent can be leveraged with any search procedure (for a total of 9 combinations here).

we'll start by building one such combination: AgentRandom with random rollout, keeping the two decoupled so that when we get the trained NN agent, AgentSupervised, we can swap out the random agent to get the combination of AgentSupervised with random rollout, as sketched below.
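
to make the swap concrete, pairing any agent with the random-rollout search should be a one-line change (hypothetical usage, assuming the interfaces sketched further down):

search_budget = 100
random_search = RandomSearch(target_file)
# today: the random agent
scores = random_search.get_score_over_time(RandomAgent(target_file), search_budget, None)
# later: the trained NN agent, with no change to the search code
scores = random_search.get_score_over_time(AgentSupervised(target_file), search_budget, None)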

Agent

Agent can be an abstract class; it needs to implement the following methods:
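
a minimal sketch of the base class (the one required method is inferred from RandomAgent below):

class Agent:
    def __init__(self, target_file):
        pass

    def get_actions_prob(self, current_graph, target_graph):
        # return a list of candidate actions and a matching list of
        # probabilities over those actions, summing to 1
        raise NotImplementedError

a concrete RandomAgent would then be: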

class RandomAgent(Agent):

    def __init__(self, target_file):
        # the face-adjacency graph of the target design
        self.target_graph = get_target_graph(target_file)

        # Store a list of the planar faces we can choose from
        self.target_faces = []
        for node in self.target_graph["nodes"]:
            if node["surface_type"] == "PlaneSurfaceType":
                self.target_faces.append(node["id"])
        assert len(self.target_faces) >= 2

        self.operations = ["JoinFeatureOperation", "CutFeatureOperation"]

    # we'll take in these arguments for consistency, even though some of them might not be needed
    def get_actions_prob(self, current_graph, target_graph):
        list_actions = []
        list_probabilities = []
        for t1 in self.target_faces:
            prob_t1 = 1 / len(self.target_faces)
            for t2 in self.target_faces:
                if t1 != t2:
                    prob_t2 = 1 / (len(self.target_faces) - 1)
                    for op in self.operations:
                        prob_op = 1 / 2

                        # an action is a (start_face, end_face, operation) triple
                        action = (t1, t2, op)
                        action_prob = prob_t1 * prob_t2 * prob_op

                        list_actions.append(action)
                        list_probabilities.append(action_prob)

        return list_actions, list_probabilities
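
a quick sanity check of the random agent (hypothetical usage, assuming a valid target_file):

agent = RandomAgent(target_file)
actions, probs = agent.get_actions_prob(None, None) # args unused by the random agent
n = len(agent.target_faces)
assert len(actions) == n * (n - 1) * 2 # ordered face pairs times two operations
assert abs(sum(probs) - 1.0) < 1e-9 # the distribution is normalized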

Search Procedures

In a nutshell, a search procedure should amplify the success rate of any agent by running it multiple times in a clever way. It needs to implement the following methods:

class Search:
    # move the logging functions from BaseSearch here
    # suggest to rename BaseSearch to ReplEnv
    def __init__(self, target_file):
        pass
    def get_score_over_time(self, agent, budget, score_function):
        pass

import numpy

class RandomSearch(Search):

    def __init__(self, target_file):
        self.log = Log() # suggest to make a Log class and plug it in here
        self.target_file = target_file

    # ignoring score function for now
    def get_score_over_time(self, agent, budget, score_function):
        target_graph = get_target_graph(self.target_file)
        # the maximum length of a rollout is the number of planar faces
        rollout_length = len([node for node in target_graph["nodes"] if node["surface_type"] == "PlaneSurfaceType"])

        used_budget = 0
        best_score_sofar = 0
        best_score_over_time = []

        while used_budget < budget:
            # open a "fresh" ReplEnv. probably try to avoid closing fusion and opening it again as that will be inefficient
            env = get_fresh_env()
            cur_graph = env.setup(self.target_file)
            for i in range(rollout_length):
                actions, action_probabilities = agent.get_actions_prob(cur_graph, target_graph)
                # sample an index rather than the actions themselves, since
                # numpy.random.choice cannot sample from a list of tuples directly
                sampled_idx = numpy.random.choice(len(actions), p=action_probabilities)
                sampled_action = actions[sampled_idx]
                cur_graph, cur_iou = env.extrude(sampled_action)
                # do some logging
                best_score_sofar = max(best_score_sofar, cur_iou)
                best_score_over_time.append(best_score_sofar)
                used_budget += 1
                # stop mid-rollout once the budget runs out, so every run
                # returns exactly `budget` scores
                if used_budget >= budget:
                    break

        # again, this should be done with some logging, but I'm explicitly returning it for now
        return best_score_over_time
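
the Log class suggested above could start out as small as this (a hypothetical sketch):

class Log:
    # minimal logging: record every (step, action, iou) and keep the trace
    def __init__(self):
        self.records = []

    def record(self, step, action, iou):
        self.records.append((step, action, iou))

    def best_iou(self):
        return max((iou for _, _, iou in self.records), default=0.0)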

Now we have the best_score_over_time for a particular target_file, with a particular search procedure and a particular agent. Let us generate the plot for it. You'll probably have to adapt it a bit to fit with the rest.

import numpy
import matplotlib.pyplot as plt

search_budget = 100
best_over_time_all_tasks = []

for target_file in all_task_files:
    random_agent = RandomAgent(target_file)
    random_search = RandomSearch(target_file)
    score_over_time = random_search.get_score_over_time(random_agent, search_budget, None)
    best_over_time_all_tasks.append(score_over_time)

# zip(*) transposes the list: all ious of all tasks on step 0, then all ious on step 1, etc
best_over_time_all_tasks = list(zip(*best_over_time_all_tasks))
means = [numpy.mean(x) for x in best_over_time_all_tasks]
stds = [numpy.std(x) for x in best_over_time_all_tasks]

# Build the plot: mean best-iou-so-far per search step, with std as error bars
fig, ax = plt.subplots()
ax.bar(range(search_budget), means, yerr=stds)
plt.show()

karldd commented 4 years ago

This sounds great. Having clear interfaces between them will make things smooth 👍

evanthebouncy commented 4 years ago

w.r.t. filtering out "clearly bad actions", go to this line and add something after it:

actions, action_probabilities = agent.get_actions_prob(cur_graph, target_graph)
# add these lines below
clearly_bad_actions = get_clearly_bad_actions(actions, cur_graph, target_graph) # this should be sufficient to filter out obviously bad actions
# set the probability for those clearly bad actions to 0
for bad_action in clearly_bad_actions:
    action_probabilities[actions.index(bad_action)] = 0.0
# re-normalize the probabilities so they sum to 1 again
action_probabilities = action_probabilities / sum(action_probabilities) # this array broadcasting will only work if action_probabilities is a numpy array

as noted above, we should probably be using numpy arrays instead of python lists, so change the agent's code as follows as well:

return list_actions, numpy.array(list_probabilities)