Open ppedrot opened 6 years ago
Looks like MutInd.make
should be made private to the kernel and not exported beyond. Don't know how to enforce this in practice though.
Looks like MutInd.make should be made private to the kernel and not exported beyond. Don't know how to enforce this in practice though.
That's a good question, it is certainly possible to do, open a bug if you think it is worth pursuing.
I think that a good first start would be to enforce the invariants statically in Kerpair, i.e. by modifying the Dual
constructor so that the following invariants hold by construction.
Invariants :
- the user and canonical kn may differ only on their [module_path],
the dirpaths and labels should be the same
- when user and canonical parts differ, we cannot be in a section
anymore, hence the dirpath must be empty
Also note that just forbidding to use MutInd.make
is not enough. You can still wreak havoc with a finely crafted substitution. It may actually be a way to do this directly in Coq, as it seems like the kernel sometimes forgets to apply substitutions in some places.
For instance, Declareops.subst_const_proj
does not apply substitution on the proj_eta
field. (Un)luckily this cannot be exploited as the only use of that field is triggered by the toplevel generation of projections, which never happens when including a module. But I'm confident that there are other places where we can leverage faulty substitutions.
I had forgotten about this one. Is anyone working on this and is it deemed necessary to fix for 8.8.1?
@Zimmi48 I don't think it's really urgent to fix it because it's only available through the ML API, but it needs to be taken into consideration at some point.
Luckily the checker does not accept the resulting vo file,
This is may have changed with the checker refactor.
@Zimmi48 I don't think it's really urgent to fix it because it's only available through the ML API, but it needs to be taken into consideration at some point.
Since Coq plugins should be outside the TCB, I feel like this should have some milestone. I tentatively picked one. EDIT:
Luckily the checker does not accept the resulting vo file
Even if that's still true, it'd probably not be good enough in practice (unless all the other checker bugs have been fixed?).
Milestoning non-blocker issues when there isn't a fix makes no sense IMO. It just makes work for the release managers who will have to postpone.
Fair and thanks for the pushback; what I did is indeed only recommended for PRs, not issues. What I meant "can you reconsider whether this should be prioritized and considered as a blocker"?
Since Coq plugins should be outside the TCB
Unfortunately that's not the case, since you can use impure code to modify kernel-facing structures. There was this funny example of an empty inductive type Faux
whose name was changed to True
, which is possible since their length coincide. Also as soon as you use unsafe OCaml features all hell breaks loose.
Ah. So a working checker would be the only recourse?
That's my understanding.
Since Coq plugins should be outside the TCB
Unfortunately that's not the case, since you can use impure code to modify kernel-facing structures. There was this funny example of an empty inductive type
Faux
whose name was changed toTrue
, which is possible since their length coincide.
Was that fixed by -safe-string
in OCaml?
Also as soon as you use unsafe OCaml features all hell breaks loose.
Sadly so. Would a better plugin mechanism (WebAssembly? Sandboxed OCaml?) be a good idea?
[This discussion probably should be split elsewhere if it extends.]
Would a better plugin mechanism (WebAssembly? Sandboxed OCaml?) be a good idea?
Maybe, but my 2 cents:
[This discussion probably should be split elsewhere if it extends.]
Agreed
Would a better plugin mechanism (WebAssembly? Sandboxed OCaml?) be a good idea?
Maybe, but my 2 cents:
- the current approach is to rely on coqchk, also according to https://www.ssi.gouv.fr/uploads/2014/11/anssi-requirements-on-the-use-of-coq-in-the-context-of-common-criteria-evaluations-v1.0-en.pdf; if coqchk fails they require restricting to plugins distributed with Coq.
Does coqchk show exactly what proof term was proved?
- coqchk is not designed to be resistant to adversarial inputs — Pervasive use of non-type-safe primitives in coqchk #7839 (comment) summarizes status and options; since then, https://metacoq.github.io/coqcoqcorrect made progress towards a verified checker in Coq.
That is good news.
- can one build an adversary-resistant plugin system without excessive performance overheads, and at what cost? Is there some intermediate level of safety that is useful? Or would focus on the checker(s) be more productive?
I am no expert on Coq, but I do know a bit about executing untrusted code :smile:.
The short answer is yes: WebAssembly is able to call into host code quite efficiently. There are a few caveats, one of which is that WebAssembly only had good support for non-GCd languages last I checked. This will change at some point, and may have already.
Does coqchk show exactly what proof term was proved?
Great point; no, and this is tricky. coqchk shows neither the statement that is proven, nor the proof term that proves it. Nobody should care about the proof term, but the statement is important.
To examine the statement proven by theorem baz
in module foo.bar
(which might use plugins), IMHO you need:
Require Import foo.bar.
and explore interactively the statement of baz
and any used definitionsNotation True := False.
to a .vo
file, without it being in the source, and the main statement is True -> P = NP
, it might be hard to notice. (A simpler variant was once exploited against https://twitter.com/proofmarket; they redefined False
to mean True
then proved the new False
— https://twitter.com/puffnfresh/status/419488475411267585).The short answer is yes: WebAssembly is able to call into host code quite efficiently.
Beyond performance, how much wrapping code is needed?
OCaml is very GC'ed, and I can't find evidence for GC and WebAsm (https://github.com/WebAssembly/gc and https://bugs.chromium.org/p/v8/issues/detail?id=7748 suggest it's WIP). Would you need to write an OCaml-specific GC backend, for the plugin to manipulate OCaml ASTs?
The short answer is yes: WebAssembly is able to call into host code quite efficiently.
Beyond performance, how much wrapping code is needed?
A lot, though most of it can (hopefully) be automatically generated.
OCaml is very GC'ed, and I can't find evidence for GC and WebAsm (https://github.com/WebAssembly/gc and https://bugs.chromium.org/p/v8/issues/detail?id=7748 suggest it's WIP). Would you need to write an OCaml-specific GC backend, for the plugin to manipulate OCaml ASTs?
My understanding is that the usual way to do this is to only expose opaque handles (perhaps stored in a hash table of some sort) and have the wasm code only work on these handles. Handles must be explicitly closed when no longer in use. In short, the answer winds up being manual memory management, sadly. That is fine if the plugins are written in Rust, where this is ideomatic and has compiler support, but less so if they are written in OCaml. Even once WebAssembly gets GC support, cleaning up guest/host reference cycles would require changes to the host OCaml runtime.
If the plugin only needs to generate ASTs (rather than interact with existing ones), another option is to serialize (using some well-defined binary format such as Cap’n Proto or protobufs) the data in the plugin, and unserialize it in Coq. That avoids the GC problems, at the expense of a serialization and deserialization step.
This discussion should by now move elsewhere, (IMHO) onto https://coq.zulipchat.com/, but I can say any sandbox should be between the kernel and the rest of Coq (per the de Bruijn criterion), as achieved by the separate checker; we have plugins customizing almost everything else, and I suspect that Coq-plugin communication is performance-critical.
And there’s no way out of checking top-level theorem statements in the output — especially since Coq cannot tell which terms are top-level theorem statements.
A separate checker needs a serialization format — alternative ones might be desirable, but one needs to preserve sharing in terms (as mentioned in https://github.com/coq/coq/issues/7839#issuecomment-397816096), or risk exponential blowups in space complexity.
This is a soundness bug, but I don't know how to trigger it without going through (deemed safe) ML code. Maybe @maximedenes can find a way to weaponize this bug in pure Coq?
Essentially, the bug is that the kernel is inconsistent when it comes to handling canonical vs. user names. A few functions are relying on canonical names, most specifically conversion, but many others are not, like e.g. environment accessing functions. And the kernel never checks that canonical vs. user parts are consistent in the environment.
The bug goes as follows. Using an ad-hoc ML tactic, we craft a nonsense inductive type that is
False
on its user part andTrue
on its canonical part.Then we profit.
Luckily the checker does not accept the resulting vo file, but then it seems to be actually due to a bug in the checker itself, similar to #7329. Indeed, contrarily to constants, inductive types need to be compared on their canonical part in reduction, as they have no body. This must be fixed there, but then we need to be careful that the two sides of inductive names agree.
TL;DR: kernel aliasing is a can of worms.