c-cube / frog-utils

[frozen] Scheduling and running jobs on a shared computer, then analyse their output
BSD 2-Clause "Simplified" License
5 stars 0 forks source link

Check proofs #22

Open Gbury opened 9 years ago

Gbury commented 9 years ago

Most, or at least some theorem provers include backends that output proofs in some format. It would be nice if frogtptp and/or frogmap was able to produce these proofs and check them with the appropriate tool when doing benchmarks. This raises some interesting questions.

First is more of a philosophical question: producing a proof can be time and memory consuming compared to simply searching for one, which means that asking the prover to produce a proof may alter in significant ways the result of a benchmark (i.e time and/or memory used, etc..). A simple solution to this would be to call the prover twice: once for the benchmark purposes, and a second time in order for it to produce the proof. However this might make benchmarks a lot longer. We could also consider that outputting the proof is part of the benchmark, but in that case, for benchmarks of different provers to be comparable, they should all produce proofs in the same format, which seems highly unlikely right now.

The second question is a design implementation: since obviously the checking of the proof is not part of the benchmark of a theorem prover, the proof checker's process statistics such as time used should be recorded apart from those of the theorem prover, which is not currently possible with a single use of frogmap. A possible solution would be to extend frogmap to map not only a single command, but a chain of commands, which would allow to record their execution time separately. That would require a serious modification of the existing code of frogmap and most likely introduce non-backward compatible changes.

It could be possible to add another tool (such as frogbench), which would execute frogmap to benchmark and produce proofs, and then another frogmap to check these proofs. However this would separate the results of the bench from the results of the proof checking, which is not very nice I find.

Any comments ?

c-cube commented 7 years ago

Would be useful for provers that output Dedukti, or SMT solvers. Must be optional though, as it does not make sense for my use case(s). Good luck! :)