emina / rosette

The Rosette solver-aided host language, sample solver-aided DSLs, and demos
Other
640 stars 74 forks source link

Optimize Query in Rosette with Externally invoked Objective function #234

Closed RafaeNoor closed 2 years ago

RafaeNoor commented 2 years ago

Hello!

We are using Rosette for our research work, and as part of that we have been using the optimize feature in Rosette for the Z3 solver and providing an objective function. For other solvers where direct optimization is not supported (e.g. Boolector) we iterate by find some sample solutions loss value and then setting that as the ceiling of the loss in the next iteration until no solution exists with a lower loss.

For both cases, the objective functions are completely defined in Rosette and they work as expected. I wanted to know if it is possible to have the objective function evaluated as an external function invocation (note this would be some loss function not expressed in Rosette) and returns some loss value opaquely?

I've tried to implement this and testing with both solvers and optimization strategies, it does work towards reducing the loss but it is not the lowest possible loss which we know exists (and is reached when the objective function is defined in rosette). Additionally, executing the same query introduces non-determinism which wasn't happening when the objective function is defined entirely in rosette.

Does Rosette support this type of behaviour? Or do you think it's possibly behaving randomly here?

emina commented 2 years ago

Can you clarify what you mean by external function?

If the question is whether it is possible to obtain a concrete solution, apply an unlifted Racket function to the concrete values within the solution, and then use the concrete result in another round of iteration, the answer is “yes”.

This has to be done carefully, or you can get unexpected behaviors. For example, you need to be sure that you are calling the external function on completely concrete values and that the external function is a pure function (so free of any side-effects).

RafaeNoor commented 2 years ago

In the workflow I’m considering, the objective function would work as follows.

In the synthesis query, we would add the assert that (cost grammar) < VALUE.

The cost function here would take the concretized expression of grammar and write it to a text files. Then from within Rosette, we would invoke a process to run a Python script which parses the contents of that text files and returns a constant integer value (hence there is file IO being done). The returned constant integer value will be used as the value of the objective function. By external I meant the Python script here being run.

emina commented 2 years ago

It sounds like there is some issue with the Python pipeline that is introducing non-determinism. The IO alone should be irrelevant here.

emina commented 2 years ago

To be clear, I’m assuming here that you are using an external function to compute VALUE, and then asserting to Rosette that (cost grammar) < VALUE, where VALUE is a concrete number. If you’re somehow interleaving Rosette and Python computation (?), that could be source of the problem.

RafaeNoor commented 2 years ago

I’m trying to do the latter, VALUE is a constant value (for example 50), The function invocation (cost grammar) is like you said interleaving rosette and Python code. It is basically a wrapper over the system command in racket .

jamesbornholt commented 2 years ago

Unfortunately, that won't work. For the solver to find the minimal cost, it needs to know the definition of the cost function. Otherwise, it has no way of knowing whether a program of cost k exists for some arbitrary k, and therefore no way of knowing whether cheaper solutions exist once it finds one solution.

Here's one way to look at it: suppose that we happen to know the optimal program has cost 10. While running your algorithm, the solver finds a solution, invokes the external program to compute its cost, and you get back the cost 10. So you add an assertion of (< (cost grammar) 10). The problem is now unsatisfiable because 10 is the optimal cost. But the solver now has to prove there is no cheaper program. It can't construct such a proof without knowing something about the definition of the cost function.

The non-determinism you're seeing is probably because the (< (cost grammar) value) assertion isn't doing what you intend—it would be worth examining exactly what that constraint evaluates to on some simple examples to make sure it's doing the right thing.

emina commented 2 years ago

To follow up on James’ comment: the cost function itself must be lifted (encoded in Rosette) for this approach to work. (My understanding was that you were using Python to compute the VALUE integer, which would have been okay.)