Auto-Scheduling without Re-running the Query Optimizer

Right now, the auto-scheduler can be used with the interpreter as follows:

souffle <program> -p <profile> --emit-statistics
souffle <program> --auto-schedule=<profile>

The first command generates a profile with index selectivity statistics.

The second command reads the profile with statistics, runs the query optimizer to find schedules, and then runs the program with those schedules.

If the user wants to re-run the interpreter with the same schedules (say, for rapid prototyping), they have to re-run the second command:

souffle <program> --auto-schedule=<profile>

But this re-runs the query optimizer even though the profile has not changed, which slows down the rapid prototyping cycle.

To fix this, we want the auto-scheduler to cache the generated schedules.

We can do this by emitting plan statements and saving them to a .plan file.

Then, when re-running with the same schedules, the user can provide the .plan file as a command-line argument, and Soufflé will use those schedules instead of invoking the query optimizer.

I imagine it would be used as follows:

souffle <program> -p <profile> --emit-statistics
souffle <program> --auto-schedule=<profile> --emit-schedules=<plan file>
souffle <program> --use-schedule=<plan file>
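
To make the intended rapid-prototyping loop concrete, here is a rough shell sketch of a wrapper around that workflow. It assumes the --emit-schedules and --use-schedule flags proposed above (they do not exist yet). The helper names and the rule "reuse the plan file only if it is newer than both the program and the profile" are my own assumptions for illustration, not part of the proposal.

```shell
#!/usr/bin/env bash
# Sketch only. --emit-schedules and --use-schedule are the flags PROPOSED in
# this issue; plan_is_fresh and run_with_cached_schedules are hypothetical
# helper names.

# Succeeds when the plan file ($1) exists and is newer than both the
# program ($2) and the profile ($3), compared by modification time.
plan_is_fresh() {
    [ -f "$1" ] && [ "$1" -nt "$2" ] && [ "$1" -nt "$3" ]
}

# Arguments: <program> <profile> <plan file>
# Reuses the cached schedules when they are fresh; otherwise runs the
# query optimizer once and saves its schedules for later runs.
run_with_cached_schedules() {
    if plan_is_fresh "$3" "$1" "$2"; then
        souffle "$1" --use-schedule="$3"
    else
        souffle "$1" --auto-schedule="$2" --emit-schedules="$3"
    fi
}
```

With this, something like `run_with_cached_schedules prog.dl prog.prof prog.plan` would only pay for the query optimizer on the first run, or after the program or profile has been modified.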

souffle-lang/souffle, issue #2247