Closed s-zanella closed 5 years ago
Thanks for the minimal repro, @s-zanella ... much appreciated.
Note, the problem can be seen independently of Kremlin. If you extract Test.fst to Test.ml you can see
let (outside : unit -> Prims.bool) =
fun uu____15 ->
M.f (Obj.magic (FStar_UInt8.uint_to_t (Prims.parse_int "0")))
The additional magic there is the problem.
The root cause of this bug has to do with memoizing extraction: See https://github.com/FStarLang/FStar/issues/1444 for more context.
Since about a2d881f3aa6ba11b6241716fcef5e669d211b510, the main control flow of the compiler is to typecheck a module, then to extract an ML interface for it (conceptually, an .mli), and to write both to a .checked file. The .mli component of a .checked file is then used to extract other .checked files either to .ml or to .krml.
This memoization of the .mli helps avoid the quadratic behavior of re-extracting all the dependences of a module.
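To make the quadratic behavior concrete, here is a toy cost model (an assumption for illustration, not a measurement of F*): a linear chain of n modules in which each module depends on all earlier ones, with one unit of work per interface extraction. The function names are hypothetical.

```ocaml
(* Toy cost model: module i depends on modules 0 .. i-1.
   One unit of work = extracting one module's interface. *)

(* Without memoization: extracting module i re-extracts the interface
   of each of its i dependencies, plus the module itself. *)
let work_without_memo n =
  let total = ref 0 in
  for i = 0 to n - 1 do
    total := !total + i + 1
  done;
  !total

(* With memoization: each interface is extracted exactly once. *)
let work_with_memo n = n
```

Real dependency graphs are much sparser than this worst case, so the actual gap is expected to be smaller than the model suggests.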
However, this memoization interacts poorly with --cmi. The .mli is extracted with respect to the "normal" dependences of a module: in the example above, this means that M.mli contains
val f : unit L.uint_t -> Prims.bool
Notice that L.uint_t was not inlined, since at .mli extraction time, the implementation module L.fst is not in scope.
However, later, when extracting Test.ml, L.fst is in scope (because of --cmi) and now we expect M.f to have type UInt8.t -> ..., and from the perspective of the ML extraction, UInt8.t and unit L.uint_t are not compatible. And it's not possible for the ML extractor to reduce unit L.uint_t to UInt8.t, since L.uint_t is extracted simply to
type 'Auu___54_25 uint_t = Obj.t
We should not extract .mli's without first inlining across module boundaries.
This means that we cannot extract .mli's immediately after typechecking a module, since the needed module implementations are not in scope.
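The mismatch described above can be reproduced in plain OCaml. This is a minimal sketch, not actual F* output: the module name L_abstract is hypothetical, int stands in for FStar_UInt8.t, and the body of M.f is a placeholder.

```ocaml
(* What the memoized interface sees: L.uint_t was extracted without
   L.fst in scope, so the match-defined type is erased to Obj.t. *)
module L_abstract = struct
  type 'a uint_t = Obj.t
end

(* M.f as recorded in the memoized interface (placeholder body). *)
module M = struct
  let f (_ : unit L_abstract.uint_t) : bool = true
end

(* With L.fst in scope (--cmi), the caller holds a concrete integer.
   [int] is a stand-in for FStar_UInt8.t. *)
let zero : int = 0

(* [M.f zero] is rejected by the OCaml typechecker: int is not Obj.t.
   The only way to make the call typecheck is to insert a cast. *)
let outside () : bool = M.f (Obj.magic zero)
```

In OCaml the cast is harmless at runtime, but once the same AST is translated to C it becomes a cast between incompatible C types.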
The proposed solution is in a few steps:
0. Remove the memoization of .mli from .checked files.
1. Re-extract .mli's for the context on each run of extraction. That should fix this bug, but will slow down extraction: we should measure the slowdown.
2. If there is a significant slowdown, then we can add a new kind of binary file, .A.fst.mli.checked (or something like that) and memoize the .mli extraction from step 1 in that file.
I'm starting on points 0 and 1 now. Comments welcome.
Thanks to @protz for some discussion about this.
Ping @aseemr
0 & 1 are now implemented in the nik_dep branch.
As for the performance impact of not memoizing .mli: A full sequential bootstrapping run with memoization took 9min 3sec; without memoization it took 11min 6sec.
So there's a non-trivial cost to this, suggesting that step 2 is worthwhile doing.
Then again, with -j 20 on my machine, which has plenty of cores, the difference in bootstrapping time is:
With memoization: 1min 14sec
Without memoization: 1min 24sec
I'm not sure the additional complexity of a third kind of auto-generated file (in addition to .hints and .checked), plus the complexity of new build rules and dependence analysis for those files, is worth it.
WDYT? @protz?
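For reference, the relative slowdowns implied by the bootstrapping timings quoted above can be computed as follows. The helper names are hypothetical; only the four timings come from the thread.

```ocaml
(* Convert "minutes, seconds" to seconds. *)
let secs m s = float_of_int (m * 60 + s)

(* Relative slowdown, in percent, of the run without memoization. *)
let slowdown_pct ~with_memo ~without_memo =
  (without_memo -. with_memo) /. with_memo *. 100.0

(* Sequential bootstrap: 9min 3sec vs 11min 6sec. *)
let sequential = slowdown_pct ~with_memo:(secs 9 3) ~without_memo:(secs 11 6)

(* Parallel bootstrap with -j 20: 1min 14sec vs 1min 24sec. *)
let parallel = slowdown_pct ~with_memo:(secs 1 14) ~without_memo:(secs 1 24)

let () =
  Printf.printf "sequential: %.1f%%\nparallel (-j 20): %.1f%%\n"
    sequential parallel
```

This prints roughly 22.7% for the sequential run and 13.5% for the parallel one.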
I would prefer not adding a new cache file. As you point out, it means additional complexity in the dependence analysis (which is already quite complex), plus mechanism and logic to detect whether the cache files are stale, keep them in sync, etc.
But can we retain incrementality of extraction without them?
Suppose we write the extracted .mli part of the .checked files at the time of extraction, rather than just after typechecking. When extracting B.fst, which depends on A.fst, we load A.fst.checked (as we do today); if the .mli part is present, we use it, else we extract it and write it back. We will also have to store flags such as --cmi so that this cache is only used later in the same context.
One issue with this, in the current setting, is that it would invalidate C.fst.checked when C.fst also depends on A.fst, because C.fst.checked contains the hash of A.fst.checked and not of A.fst. For that, can we make it so that we don't hash the checked files themselves, but only hash the source files?
But this also doesn't seem worth it to me. Can we also benchmark an Everest run to see if there is any noticeable performance hit there?
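The lazy caching scheme floated above could look roughly like the following. This is only a sketch under stated assumptions: the record type, extract_mli, and get_mli are hypothetical and do not reflect the actual layout of .checked files or any F* API.

```ocaml
(* Hypothetical model of a .checked file: a module name plus an optional
   cached interface, tagged with the flags it was extracted under. *)
type checked = {
  name : string;
  mutable mli : (string list * string) option; (* (flags, interface) *)
}

(* Counts how often the expensive extraction actually runs. *)
let extractions = ref 0

(* Stand-in for real interface extraction. *)
let extract_mli flags name =
  incr extractions;
  Printf.sprintf "mli(%s; %s)" name (String.concat " " flags)

(* Reuse the cached interface only if it was produced under the same
   flags; otherwise extract it and write it back. *)
let get_mli flags c =
  match c.mli with
  | Some (cached_flags, mli) when cached_flags = flags -> mli
  | _ ->
    let mli = extract_mli flags c.name in
    c.mli <- Some (flags, mli);
    mli
```

Note that the write-back mutates the cache entry, which is exactly what would change the hash of A.fst.checked and invalidate its dependents in the current setting.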
I'd also prefer not to add a new kind of binary cache file, or to repurpose .checked files to incrementally memoize extraction information. Dependency analysis and build rules are already complex as they are.
A worst case 25% slowdown sounds reasonable.
Here's another data point that doesn't show any ~significant~ unbearable slowdown in parallel builds:
Building F* from scratch in nik_dep (time OTHERFLAGS="--admit_smt_queries true" ./everest -j 10 FStar make): 10m7.611s
Building Frodo with nik_dep (time OTHERFLAGS="--admit_smt_queries true" make -j 10 -C frodo/code): ~2m45.263s~ 3m28.783s
Building F* from scratch in master (time OTHERFLAGS="--admit_smt_queries true" ./everest -j 10 FStar make): 10m10.751s
Building Frodo using master (time OTHERFLAGS="--admit_smt_queries true" make -j 10 -C frodo/code): 2m34.794s
Edit: did another measurement and it showed a 35% slowdown in Frodo. I'm now unsure my first measurement was using the correct FStar branch.
During KreMLin extraction when using --cmi, sometimes F* inserts unnecessary casts in the generated AST. When processed with KreMLin, these casts translate to casts to incompatible types (e.g. (void **) when the expected type is uint8_t *, or (void *) when the expected type is uint8_t) and a C compiler rightfully complains about them.
Starting from code/sha3 in hacl-star/_dev_frodo that exhibits this behaviour, I minimized the issue to these 4 files:
L.fsti
inline_for_extraction val uint_t: inttype -> Type0
inline_for_extraction type uint8 = uint_t U8
inline_for_extraction val u8: (n:nat{n < 256}) -> uint8
M.fst
open FStar.HyperStack.ST
open L
let f (b:uint8) : St unit = ()
let inside () : St unit = f (u8 0)
The expected result is that M.inside and Test.outside extract the same to C, but they don't:
The symptom is that when trying to compile the extracted C, the compiler errors:
Passing -dast to KreMLin shows an extra cast in Test.outside that is absent in the call to M.f in M.inside: M.f (0uint8<: L_uint_t ()).
It's crucial that L.uint_t is a match. Defining it instead as let uint_t _ = UInt8.t works around the issue, but of course can't be done when there's more than one case. Another workaround is to mark M.f as inline_for_extraction, but this is not always desirable.
An unsatisfactory workaround is to compile with -Wno-incompatible-pointer-types -Wno-int-conversion -Wno-int-to-pointer-cast, but who knows what the C compiler is allowed to do when these warnings are triggered.
I'm attaching the 4 files as well as an incremental Makefile to reproduce the issue: casts.tar.gz
I'm setting the priority to fix this to high because using Lib.IntTypes in hacl-star triggers the issue and so it will show everywhere in hacl-star when using --cmi.