Verified extraction to OCaml

palmskog commented 5 years ago

To verify extraction to OCaml, one needs a semantics of OCaml in Coq (which may or may not be verified itself in some way). This is an attempt to document relevant resources to make this happen.

One early effort on formalizing OCaml semantics is OCaml Light by Scott Owens and his collaborators. The language and semantics is encoded in the Ott language and then exported to HOL4, Isabelle/HOL, and Coq. The soundness proof of the type system is only in HOL4, however. A paper describing the formalization and proofs was published in ESOP 2008. All files are available in the Ott repository on GitHub. The repository version of Ott can produce a definition of OCaml Light that compiles with recent Coq.

While there are some related efforts to formalize OCaml in different proof assistants, as discussed here, OCaml Light seems to me to still be the most realistic and relevant one for verifying extraction in MetaCoq.

yforster commented 5 years ago

Would type soundness be needed for the correctness of extraction? Probably everything you would need is a definition of the OCAML light type system and a definition of evaluation on OCAML light, plus potentially some minor convenience Lemmas about evaluation.

But I don't think type safety would actually be needed to prove a simulation result between erased Coq terms and OCAML.

yforster commented 5 years ago

Ah, and as @mattam82 has mentioned before, OCaml light probably does not cover Obj.magic, which is inserted a lot by the extraction at points where the erased Coq terms have boxes.

I have been thinking about extraction to cbv lambda-calculus, encoding recursion using fixed-point combinators and inductives using scott encodings (as we do it in the ITP paper). In untyped lambda-calculus, you can implement Obj.magic again using a FPC. Probably the same trick can be used for untyped OCaml, but to get the extraction type-check we'd have to extend OCaml light by Obj.magic (which might still not be very hard).

palmskog commented 5 years ago

I think you're right that having the type soundness proof in Coq is not strictly necessary for doing verified extraction. In theory, the only thing we want is that the OCaml Light semantics is consistent with what is actually implemented in a recent OCaml compiler, which is justified in the paper and could also be justified by differential testing. Nevertheless, it could be a good student project to translate the OCaml Light type soundness proof from HOL4 to Coq, for example, using CoqHammer to do the translation at the high-level argument level.

Right, you need Obj.magic also, and hopefully it's not too hard to add. But a first step might be to see if the OCaml Light semantics could be used to verify extraction for some really restricted subset of Gallina that never requires Obj.magic?

palmskog commented 5 years ago

There is now a standalone repository that can generate the OCaml Light definitions that works in Coq 8.8: https://github.com/palmskog/ocaml-light-extraction

@mattam82 can I add extra-dev OPAM packages for MetaCoq so I can use them in CI for MetaCoq-dependent projects like this one?

mattam82 commented 5 years ago

Sure! Do you think the current .opam files will work easily? I wonder what's the most efficient way to do that.

mattam82 commented 5 years ago

About the changes to erasure that would be necessary to target ocamllight: certainly quite a few Obj.magics would be needed for dependently-typed code. About linking to CertiCoq, we discussed a bit with Andrew Appel and Aquinas Hobor, the first obstacle we see would be updating the garbage collector to handle refs. The purely functional character of Coq helps a lot having an efficient GC that does not require any operation at deallocation time for example.

palmskog commented 5 years ago

I submitted the OPAM package definitions based on the .opam files, and they seem to work as intended.

The initial workflow I had in mind for OCaml Light extraction was actually to completely elide CertiCoq and output the verified OCaml code in the simplest way possible (e.g., printing it from inside Coq). This has the advantage that it can be applied directly to lots of our existing OCaml-based extraction projects for testing, although the OCaml compiler of course remains in the TCB.

With respect to Obj.magic and OCaml Light extraction, one plan could be as follows:

set up boilerplate for MetaCoq+OCaml Light extraction correctness (for some tiny language fragment?)
prove correctness (and output well-typing) for "typedef" OCaml Light (no modules) assuming input Coq code never needs Obj.magic or modules (is not dependently-typed)
extend OCaml Light definition and type system with Obj.magic
extend correctness proof to dependently-typed Coq code
extend correctness proof to "modules+typedef" OCaml Light for Coq code with dependent types and (non-parameterized?) modules

annenkov commented 4 years ago

Hi guys, I've been looking in the same direction. It looks like if we want to integrate this with the certified erasure, we need, probably, to do it in a similar way, as it described in Pierre Letouzey's thesis: implement a modified version of the type inference algorithm (Pierre's proposal is to use algorithm M) to recover types from the CIC_box. I am currently taking a bit different route: remove redundant boxes after erasure and pretty print OCaml-like code (I target smart contract languages). It is safe to remove boxes if the constants and constructors are applied to all logical arguments (either coming from proofs or types). I've implemented a version of erasure for types to OCaml-like types and validation procedures that check if constants and constructors are applied enough. It is available in my fork: https://github.com/annenkov/template-coq/blob/coq-8.11-erase-annotated/erasure/theories/Debox.v I also added some information to binders, so one can know if the binder is a "dummy" abstraction or let-in and can be potentially removed.

I have a working pretty-printer that prints Liquidity code (basically, some sort of restricted OCaml): https://github.com/AU-COBRA/ConCert/blob/extract-cert/extraction/theories/LPretty.v In fact, it can handle refinement types: https://github.com/AU-COBRA/ConCert/blob/extract-cert/extraction/examples/CounterRefinementTypes.v

Maybe we can accommodate some of that stuff for OCaml Light.

MetaCoq / metacoq

Verified extraction to OCaml #163