microsoft / prose

Microsoft Program Synthesis using Examples SDK is a framework of technologies for the automatic generation of programs from input-output examples. This repo includes samples and sample data for the Microsoft Program Synthesis using Example SDK.
https://microsoft.github.io/prose/
Other
620 stars 100 forks source link

Disjunctive spec clarification question #42

Closed gtanzer closed 4 years ago

gtanzer commented 4 years ago

I'm having trouble understanding exactly how disjunctive specs and conditioned witness functions (for multiple parameters) are supposed to interact.

I made a minimal example here. It's a language with only (recursive) string append, i.e. for inputs x it can generate x, Append(x, x), Append(Append(x,x),x), etc.

The master branch and rev-cond branch show the same approach but condition the witness functions in the opposite order. Based on the conditioning, it can only synthesize programs with Append recursively along one branch. The problem seems to be that the second witness function is never dispatched, and it should take a disjunctive spec as an argument instead, since that's what the first returns. But intuitively, I would think that a witness function on an example spec could automatically be mapped over a disjunctive spec; is there something unsound about this or is it just not supported?

On the disjunctified branch, I tried to use DisjunctiveExamplesSpecs instead, but with even a reduced version for debugging, PROSE invokes the first witness function in an infinite loop on the same input (and never invokes the second). I would expect this not to be possible because (as I understand it) the results of these calls are memoized. I've tried many type signatures for the second witness function, and it never seems to be invoked.

But even if it did work, I'm not sure how you would maintain dependencies across the conditioned functions. For example, if you're trying to synthesize a program for "ff" -> "fff" (which can't be represented in this language), the prefix witness function would say that the left input could be "" | "f" | "ff" | "fff", and given that disjunctive spec, the suffix witness function would say that the right input could be "fff" | "ff" | "f" | "". But now the dependencies between particular terms across those disjunctions seem to be lost. What am I missing?

Thanks

(also tagging @namin to follow this)

danpere commented 4 years ago

I'm very sorry for the slow reply. This disappeared into the depths of my inbox, so I just saw it.

  1. I haven't tried locally, but I think the very short answer is that your disjunctified branch is correct except you need to do < instead of <= on line 48 of WitnessFunctions.cs. The witness functions are not memoized: the memoization happens at the level of Specs (well, LearningTasks) mapped to VSAs (ProgramSets), which turns out to be pretty close to equivalent. The problem in this case is that your recursion doesn't have a base-case because one of the options for the prefix is the entire string, so it never completes the recursion and never actually learns a program.

  2. ExampleSpec is a subtype of DisjunctiveExamplesSpec (as it's the special case where there's only one option). And resolution of witness functions supports sub-typing as you would expect, so when trying to learn with an ExampleSpec, PROSE will try to find the most specific witness function and call one that takes a DisjunctiveExamplesSpec is that's the best that's available. (In other words, the answer to your question "But intuitively, I would think that a witness function on an example spec could automatically be mapped over a disjunctive spec" is that yes, it works how you expected.)

    Your master and rev-cond branches both have the problem that WitnessSuffix returns a DisjunctiveExamplesSpec but you don't have any witness functions that accept such a Spec. The reason you're getting the very simple programs is that the only thing that can be learned for a DisjunctiveExamplesSpec is just the input variable, since it does satisfy the DisjunctiveExamplesSpec.

  3. Without DependsOnParameters, the Specs input to a witness function are all generated by other witness functions (or the original Spec given to the learn call). With DependsOnParameters, for the depended on parameters, ExampleSpecs are generated by PROSE based on actually learning all programs for that subtree of the grammar and computing their values (obviously done in a careful way to not literally run all of the programs). Even if the relevant witness function learns an ExampleSpec, PROSE will still learn the full sub-AST first to verify there actually exists a program with that value.

gtanzer commented 4 years ago

No worries about the delay; it's been a hectic month... Thanks for the detailed response!

Re #1, I'm reading this as the witness functions need to use well-founded recursion; is this a correct understanding? Does this mean that the program space that satisfies a particular input/output spec is necessarily finite? (This seems unobjectionable in cases like append, where the extra programs just have added no-ops, but maybe not in generality.)

danpere commented 4 years ago

Yes, if you have an infinitely recursive grammar, then you have to make sure that your witness functions for that part of the grammar use well-founded recursion. Note that you can annotate a grammar rule with an annotation like @recurse[2] to limit to a finite recursion depth, which is equivalent to manually unrolling the grammar.

I think the short answer is that, yes, ProgramSets have to be finite. I know there's parts of PROSE that have support for infinite ProgramSets, but I don't think we actually use that support for anything, so I'm not sure exactly to what extent it actually works. I do know that clustering over an infinite ProgramSet is not supported, so the DependsOnParameters feature will not work as it computes a concrete list of all of the values the programs can evaluate to. Needless to say, if that set is infinite, then it won't work.

gtanzer commented 4 years ago

OK, thanks!