Closed mlb2251 closed 2 years ago
Proposal: remove the "multiarg/multiuse" move from expansion. Instead, at each step we choose a hole and case-split it on whether it's an App (of 2 new holes), a Lam (of a hole), or a Prim/Var (in which case we case-split on the exact value). This splits the space into disjoint subsets of usage locations. Note that splitting on the exact value of a Prim/Var is the same as the idea of building out the body of an invention (without this we would just be looking at the app/lam structure). The splitting is safe and still enumerates all possible programs, because case-splits are safe in general: you produce the set of all possible cases for a given hole, so you don't lose anything.
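A minimal sketch of this per-hole case-split, with illustrative types (the `Pattern` enum and `expansions` function are my own names, not the actual implementation):

```rust
// Sketch of the per-hole case-split (illustrative types, not the real ones).
// Each hole expands into one of a set of disjoint cases, so the usage
// locations matching a pattern are partitioned rather than duplicated.
#[derive(Debug, Clone, PartialEq)]
enum Pattern {
    Hole,
    App(Box<Pattern>, Box<Pattern>), // case-split into 2 new holes
    Lam(Box<Pattern>),               // case-split into 1 new hole
    Prim(String),                    // split on the exact primitive/variable
}

// Enumerate every expansion of a single hole, given the primitives/vars that
// actually occur at the matching locations. Together the cases cover all
// possible subtrees, so no program is lost.
fn expansions(prims_at_locations: &[&str]) -> Vec<Pattern> {
    let mut out = vec![
        Pattern::App(Box::new(Pattern::Hole), Box::new(Pattern::Hole)),
        Pattern::Lam(Box::new(Pattern::Hole)),
    ];
    for p in prims_at_locations {
        out.push(Pattern::Prim(p.to_string()));
    }
    out
}
```

Since each location's subtree is exactly one of these cases, summing over the branches recovers the full location set.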
Also, at each step we would take a pattern with a frontier of holes and fully do multiarg/multiuse/refinement in the moment to get a bunch of finished inventions.
Why is it better to do multiarg/multiuse/refinement at the end?
Approaches to concretizing at the end:
This builds on the previous two ideas. We will first do a largely arity-blind search for patterns with frontiers of holes, but with a few differences:
A new branch, `arg_choice`, which is one of the branches of expanding any hole. This is the branch where this node is used as an ivar in the final invention (but the index number is not yet assigned).

An assignment string records which ivar each argchoice maps to: `010` means the first and third argchoices were assigned to `#0` and the second argchoice was assigned to `#1`. `101` and `010` are identical inventions, so we canonicalize: the first `0` always comes before the first `1` in the string, and so on. Likewise `1` is not a valid length-one string; only `0` is, etc.
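The canonicalization rule ("the first `0` always comes before the first `1`, and so on") amounts to requiring each digit to be at most one greater than the largest digit seen so far. A sketch, with a hypothetical `is_canonical` helper:

```rust
// A canonical argchoice string introduces ivar indices in first-use order:
// each digit may be at most (max digit seen so far) + 1, starting from '0'.
// So "010" is canonical while the equivalent "101" is not, and "0" is the
// only valid length-one string.
fn is_canonical(assignment: &str) -> bool {
    let mut next_fresh = 0u32; // smallest ivar index not yet used
    for c in assignment.chars() {
        let d = match c.to_digit(10) {
            Some(d) => d,
            None => return false,
        };
        if d > next_fresh {
            return false; // skipped an index, e.g. a string starting with '1'
        }
        if d == next_fresh {
            next_fresh += 1;
        }
    }
    true
}
```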
For `010`, we first assign `0` to the first node, then `1` to the second node, then `0` to the third node, at which point we subset the locations to the ones where `argchoice[0] == argchoice[2]`. This is actually really easy if you maintain a separate vec for each argchoice with length equal to the length of the subset: just zip them with the utilities list (of length equal to the locations list), filter on equality between the argchoice arrays, and sum the result.
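The zip-and-filter step might look like the following sketch; the function name and its argument layout (one vec per argchoice, one entry per location) are my assumptions:

```rust
// For assignment "010" we keep only locations where the first and third
// argchoices took the same value. With one vec per argchoice (each the
// length of the location subset), we zip them against the per-location
// utilities, filter on equality, and sum the surviving utility.
fn multiuse_subset_utility(
    argchoice_a: &[u8], // values of one argchoice at each location
    argchoice_b: &[u8], // values of another argchoice at each location
    utilities: &[i64],  // one utility per location
) -> (Vec<usize>, i64) {
    let mut kept = Vec::new();
    let mut total: i64 = 0;
    for (i, ((a, b), u)) in argchoice_a
        .iter()
        .zip(argchoice_b.iter())
        .zip(utilities.iter())
        .enumerate()
    {
        if a == b {
            kept.push(i); // this location survives the subsetting
            total += *u;
        }
    }
    (kept, total)
}
```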
If `00` as an assignment prefix and the resulting subsetting got you such a low score that you were below the upper bound, you can stop without exploring any extensions of that assignment: that prefix alone subsets the locations enough to fall below the upper bound. In any offspring of this partial invention, `00` will result in an even smaller subset within the parent's `00` subset! And since we know that locally each of the utilities for nodes in that subset will only have gotten smaller in the offspring, we know that this prefix will fail the upper-bound test for the offspring too. This means parents need only pass the prefix reject list along to their children; the child can just skip over each rejected prefix when it constructs assignments, and it adds onto the list as more rejects are discovered. Note that a child might newly find that a prefix like `00`
is not allowed even though it was allowed in the parent, etc., of course.

The initial worklist is just the single-hole partial invention.
loop:

- Priority: `num_expansions * locations` or `prims * locations` or something like that. Basically the goal here is to prioritize large things that are used in many places. (`no_opt_useless_abstract`)
- `no_opt_force_multiuse`: I think we already capture that during assignment?
- `worklist_buf`: however, you should continue to search through the rest of the things to record warning prefixes for the future.
- Oh, also don't allow an expansion into a lambda on the very first hole; we don't allow inventions with a lambda at the top.
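The prefix reject list described above could be carried as a plain set of strings; this is a sketch with hypothetical names, not the actual worklist code:

```rust
use std::collections::HashSet;

// A parent passes its reject list down to its children. A child skips any
// assignment that extends a rejected prefix (the location subset only
// shrinks in offspring, so the upper-bound test is guaranteed to fail
// again), and records new rejects of its own as it discovers them.
fn should_skip(assignment: &str, rejected_prefixes: &HashSet<String>) -> bool {
    rejected_prefixes
        .iter()
        .any(|p| assignment.starts_with(p.as_str()))
}

fn record_reject(prefix: &str, rejected_prefixes: &mut HashSet<String>) {
    rejected_prefixes.insert(prefix.to_string());
}
```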
Implemented in #92 with great success, wahoo!
## Summary
A map `zid -> list of subsets`, which gives the subsets of the full node set; you can therefore filter these for the nodes in your usage set to get the real subsets in this case. See `main` for this, where you cast it as refinement.

## Visual
## Comparison to old approach

## Upper bounding

## Multiuse bounds

## Calculating subsets

## Search order
`size * usages` or something like that, which seems like it could pretty quickly bring you towards the best invention.

## Entropy in Search Order
## Preliminary tests on multiarg upper bound weakening
159 -> 516 partial inventions (still getting a lot of upper-bound and single-use pruning).

## Multiuse Ideas
## Open Qs
How should expanding something work when it might expand into an app or lam or prim?
It also really feels like maybe, in an arity-free way, you could find these largest shared sections and then just start trying them out in decreasing order.