GaloisInc / cryptol

Cryptol: The Language of Cryptography
https://galoisinc.github.io/cryptol/master/RefMan.html
BSD 3-Clause "New" or "Revised" License

Support calls to foreign functions #1090

Closed: atomb closed this issue 2 years ago

atomb commented 3 years ago

Allowing Cryptol programs to reference external functions, written in other languages, could have several benefits:

  1. It could allow users to write high-performance primitives, either to make the interpreter faster or to avoid rewriting existing libraries in Cryptol.
  2. It could allow Cryptol code compiled to software to reference existing interfaces.
  3. It could do similarly for Cryptol code compiled to hardware.
  4. It could allow use of special decision procedures within provers to efficiently handle constructs outside of standards like SMT-Lib.

It may be possible to do this with very little change to the language. The existing notion of a primitive in Cryptol is close to what we need: as it stands, declaring a primitive means that any prover, interpreter, or compiler needs to know how to model, execute, or generate code for that function without (necessarily) having access to Cryptol source code for it.

Given that users can already declare primitives within any source file, the missing piece seems to be having a way to map primitives to external references. Some programming languages with foreign function interfaces have very flexible mechanisms allowing users to specify how external code is called, how parameters are passed, and so on. In the name of simplicity, it may be better (at least at first, potentially permanently) to simply state what the calling convention for external code should be.

If we do that, then the only remaining piece is a way for users to specify where an external symbol is located. For the four use cases listed above, here are some possibilities:

  1. Specify an external shared library file containing the symbol. Assume that the library has a symbol with the same name that has a type compatible with the calling convention we would use when generating C code in (2) below.
  2. Compile the Cryptol code to a software implementation that includes a call to a function with the same name as the primitive, using some specified calling convention. No extra user input is needed until the linking stage.
  3. Similar to (2).
  4. Translate the declared primitive to a symbol with the same name in the generated formula.

One potential issue here would be the accidental use of primitives. Currently, trying to :prove a formula that references a user-declared primitive that the symbolic backend doesn't know about yields an "unimplemented primitive" message. It might be less user-friendly to instead get an error back from the solver saying that it doesn't understand the formula Cryptol has sent it. Perhaps an additional marker, such as external primitive or even just external, could make this more explicit?

For case (1), one remaining issue is to tell the interpreter where to find the shared library containing the code to execute. We could follow Cryptol 1 by adding syntax to specify this within a Cryptol source file. Or we could allow the interpreter to read some sort of separate "map" file that tells it where to find primitives it doesn't have built-in support for.

robdockins commented 3 years ago

From a language design point of view, I think this will be pretty easy actually. Our internal AST already has places to pipe in "pragmas" that can be attached to bindings. It's then just a matter of deciding what the syntax should be. I suggest something like the following:

external asdf "ccall" "c_symbol_asdf"
external asdf "java" "package.nonsense.JavaClass.asdf"

Then the interpreter and any code-gen backends just have to know which FFI conventions to pay attention to and how to interpret them.

abakst commented 3 years ago

A minor point: we'll still need a way of annotating the type for the symbol. I expect this is essentially the same as writing primitive, but it's probably a good idea to keep these sorts of symbols distinct (or make Cryptol primitives a special case?)

external foo : t1 -> t2

More importantly, considering the above example, some backends might need/want additional information, so it should be possible to apply multiple annotations. Presumably these don't need too much structure, and the responsibility for interpreting the annotation belongs to the interpreter.

Taking hardware as an example, I'd think we might want to tell the translator: (1) what the name of the module is, (2) what the names of the input/output signals are

abakst commented 3 years ago

I wonder if this just boils down to exposing the ability to annotate declarations? In the case of translator-specific foreign definitions, I'm not sure the annotations have any meaning to Cryptol itself, except to say that there may not be an implementation.

robdockins commented 3 years ago

We won't need to annotate the type; the typechecker will figure it out for us. The idea is that external is a declaration that lives alongside some other declaration that brings the symbol into scope, whether that be a definition or a primitive declaration. Additionally, you could have several external declarations for a single symbol.

robdockins commented 3 years ago

E.g. you might write

primitive asdf : [256] -> [256]
external asdf "ccall" "c_asdf_impl"

foo : [5][16] -> [16]
foo xs = xs@0

external foo "ccall" "foo_impl"

I don't know what we would want to do with polymorphic things; binding those to external stuff seems much harder.

dmwit commented 3 years ago

One thing that may be worth thinking about is how/whether to handle polymorphic primitives. Maybe for some languages it doesn't make sense... but then again, maybe there's some suitable representation of Cryptol types we could create in that language and supply as a parameter so that it does make sense.

For the "allow users to create high-performance primitives" one in particular, I could easily imagine having Haskell as a choice of implementation language, and it seems completely reasonable to allow such primitives to be polymorphic.

yav commented 3 years ago

This would probably be part of the calling convention. For example, I could imagine a calling convention where numeric type parameters are mapped to some kind of number when calling C.
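
As a rough sketch of that idea (the name sum64, its [n][64] signature, and the exact convention below are hypothetical, not something that has been settled on), a size-polymorphic primitive could be paired with a C function that receives the numeric type parameter as an ordinary size_t value:

/* Hypothetical pairing for a Cryptol declaration such as
     primitive sum64 : {n} (fin n) => [n][64] -> [64]
   where the numeric type parameter n is passed as a plain value parameter. */
#include <stddef.h>
#include <stdint.h>

uint64_t sum64(size_t n, const uint64_t *xs) {
  uint64_t acc = 0;
  for (size_t i = 0; i < n; i++)
    acc += xs[i];
  return acc;
}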

atomb commented 3 years ago

One consideration might be whether we want to be able to have one external reference used in multiple ways. Say we have a signature algorithm that builds on SHA3. In the interpreter, we may want to call out to an external implementation in a specific shared library. When compiling to C, we may want to include a call to some existing C source code. And we may want to be able to do both from the same source file.

What this suggests to me is that the declaration "this thing is external" and the metadata "here's where to find it" are somewhat separate, and it may make sense to put them in different places. The "this is external" aspect maybe belongs in the source code, but the "here's where to find it" part may make sense as separate input to the interpreter/compiler/whatever.

Another thing that might simplify things: because we're talking about assuming a calling convention that we specify, might it also be sensible to assume that the external function has the same name that it does internally in Cryptol? So external asdf "ccall" "c_symbol_asdf" could instead be external asdf "ccall"? Or even just external asdf? Given that we'll likely have to write wrappers around existing code to satisfy our calling conventions, making sure those wrappers have the desired names seems easy enough.

dmwit commented 3 years ago

@atomb Perhaps you could simply allow multiple external declarations for the same symbol? So if you want an interpreter extension and a C compilation extension, you'd write

external asdf "ccall"
external asdf "interpreter"

or something like that.

RyanGlScott commented 2 years ago

See #1376 for an implementation of this idea. Some interesting challenges have arisen while prototyping this feature regarding how we should link against the libffi C library, which #1376 uses to interface with C code. These challenges broadly fall into two categories:

  1. GHC ships its own version of libffi, which clashes with the system libffi.
  2. libffi isn't as easy to find on Windows as it is on other operating systems.

First, let me explain each issue in more detail:

  1. Most of the time, GHC uses dependencies that come with the operating system, such as gcc and ld. One notable exception to this rule is libffi, which GHC ships its own custom version of. (See here.) GHC does this because the upstream libffi repo is notoriously slow to make new releases, and there are some libffi bugfixes that have yet to appear in an official libffi release. These bugfixes are usually for architectures like ARM and PowerPC.

    Shipping a custom version of libffi works well for GHC's needs, but it unfortunately complicates the story for Haskell libraries that also want to link against libffi. This is because when GHC links a binary, it will always put its custom version of libffi at the front of the linker search path, effectively preventing you from linking against the system-provided version of libffi that comes with your OS. Note that this only applies to linking code, however. When you run the compiled binary, it will load the libffi shared library that your OS provides.

    What exactly goes wrong with this setup? If you are using both a recent OS and a recent version of GHC, there is a good chance that nothing will go wrong. This is because it is likely that the version of libffi that GHC ships will match the version of libffi that the OS provides. As a result, when you run a Haskell binary that dynamically links against libffi, the loader will find your OS's version of the libffi shared library. Moreover, because this version of libffi matches the version that GHC ships, everything will just happen to work out. (The differences in behavior between GHC's libffi and the system libffi tend to be negligible on most common OSes.)

    Where things do go wrong is if these two libffis happen to have different versions. A concrete example where this arises is when using GHC 8.10.7 on Ubuntu 18.04. GHC 8.10.7 ships with libffi7, but Ubuntu 18.04's package manager provides libffi6. As a result, if you use GHC to link against libffi on this GHC/OS combination, the compiled binary will be linked against GHC's libffi7. When you try to run the binary, however, it will fail, since the system libffi has a different version:

    /home/runner/work/cryptol/cryptol/dist-newstyle/build/x86_64-linux/ghc-8.8.4/cryptol-2.13.0.99/x/cryptol/build/cryptol/cryptol: error while loading shared libraries: libffi.so.7: cannot open shared object file: No such file or directory

    It is worth noting that GHC 9.4+ will ship with libffi8, so this version mismatch will likely arise with GHC 9.4 + Ubuntu 20.04 (and possibly other OSes) in the future.

  2. Linux and macOS make it relatively straightforward to install pre-built libffi libraries through their respective package managers. Then there is Windows, which doesn't have a de jure package manager for libraries like libffi. The MSYS2 project is arguably the de facto package manager for C libraries, however, and MSYS2 does provide a native Windows build of libffi. There is a catch, though: MSYS2 uses pacman as a package manager, which follows a rolling release model for versioning. As a consequence, the ABI of MSYS2-provided packages is liable to change without notice, and this has adversely affected GHC on Windows in the past. (See here for a recent example.)


How should we address each of these problems?

  1. While GHC does offer a configure-time option for using the system libffi instead of bundling its own, this requires building all of GHC from source, which isn't really feasible for CI purposes. Therefore, I think there are really only two realistic paths forward:

    a. Build our own, statically linked version of libffi and link Cryptol against that.

    b. Copy GHC's bundled libffi shared library to the directory where Cryptol is being built and link against it using -Wl,-rpath,$ORIGIN. When creating the bindist, include both the Cryptol binary and the libffi shared library in the same directory so that running Cryptol will load that copy of the libffi shared library.

    Each of these approaches has its pros and cons. Option (b) would perhaps be the least amount of work, since we wouldn't have to figure out how to build libffi from source (with static linking, no less). On the other hand, option (b) would require some tweaks to the way bindists are created, whereas option (a) would not.

    Disclaimer: I have not attempted to implement either solution in full generality, so there could be obstacles that aren't yet apparent. It is simple enough to try a local version of option (b). For instance, given this C file:

    // ffi.c
    #include <stdio.h>
    #include <ffi.h>
    
    void libffi_ex(void)
    {
     ffi_cif cif;
     ffi_type *args[1];
     void *values[1];
     char *s;
     ffi_arg rc;
    
     /* Initialize the argument info vectors */
     args[0] = &ffi_type_pointer;
     values[0] = &s;
    
     /* Initialize the cif */
     if (ffi_prep_cif(&cif, FFI_DEFAULT_ABI, 1,
                          &ffi_type_sint, args) == FFI_OK)
       {
         s = "Hello World!";
         ffi_call(&cif, puts, &rc, values);
         /* rc now holds the result of the call to puts */
    
         /* values holds a pointer to the function's arg, so to
            call puts() again all we need to do is change the
            value of s */
         s = "This is cool!";
         ffi_call(&cif, puts, &rc, values);
       }
    }

    And this Haskell file:

    -- FFI.hs
    {-# LANGUAGE ForeignFunctionInterface #-}
    module Main (main) where
    
    foreign import ccall "libffi_ex" c_libffi_ex :: IO ()
    
    main :: IO ()
    main = c_libffi_ex

    You can do a "test run" of this RPATH trick like so (on Ubuntu 20.04):

    $ ghc FFI.hs ffi.c '-optl-Wl,-rpath,$ORIGIN' -lffi
    Linking FFI ...
    $ ./FFI
    Hello World!
    This is cool!
    $ ldd FFI
           linux-vdso.so.1 (0x00007ffd02dc7000)
           libffi.so.7 => /home/rscott/Documents/Hacking/C/./libffi.so.7 (0x00007f9f756b3000)
           <snip>

    You'll know if the trick works by inspecting the ldd output, which should show a reference to the libffi.so.7 file that we copied over. We would need to include this copied libffi shared library in the bindist as well.

  2. What about Windows? Unfortunately, GHC Windows bindists don't appear to include libffi.dll files. (Perhaps Windows GHCs link against libffi statically?) Moreover, it seems unwise to bundle pre-built MSYS2 libraries from pacman given that they do not have a stable ABI. As a result, I think our only real option here is to build libffi from source. Here, we have a choice of whether we would want to build libffi dynamically or statically—I don't have a sense for which is easier to accomplish on Windows.

    Given that we would have to build libffi from source on Windows, it might be worth trying to build libffi statically on all supported OSes. That would allow us to have relatively consistent libffi packaging across all OSes.

qsctr commented 2 years ago

I came across this issue some time after I had already begun working on the FFI, so the currently implemented syntax is a little different from what is proposed in this issue. But right now in #1376 I have the following implemented: given a file Foo.cry with

foreign func : [64] -> [64]

when the module is loaded, Cryptol will look for a file named Foo.so (Foo.dylib on macOS; Windows is not supported yet) and dynamically load a symbol with the same name, func, using the dlopen/dlsym APIs. Then, when the function is called in Cryptol, the interpreter will call func from the shared library, performing marshalling between Cryptol and C types for the argument and return value. We are able to do this call without knowing the types at compile time by using the libffi library. Arguments are always fully evaluated before the foreign function is called.
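
For anyone unfamiliar with those APIs, a minimal C-only sketch of the loading step looks roughly like the following (this is just an illustration of dlopen/dlsym; the actual implementation lives in Haskell and performs the call itself through libffi):

#include <dlfcn.h>
#include <stdint.h>
#include <stdio.h>

typedef uint64_t (*func_t)(uint64_t);

int main(void) {
  /* Load the shared library Foo.so... */
  void *lib = dlopen("./Foo.so", RTLD_NOW);
  if (!lib) { fprintf(stderr, "%s\n", dlerror()); return 1; }

  /* ...and look up the symbol with the same name as the Cryptol declaration. */
  func_t func = (func_t) dlsym(lib, "func");
  if (!func) { fprintf(stderr, "%s\n", dlerror()); return 1; }

  printf("%llu\n", (unsigned long long) func(41));
  dlclose(lib);
  return 0;
}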

Currently the only type that is supported is [64], which maps to a 64-bit unsigned integer, so the C source could be something like

unsigned long func(unsigned long x) {
  return x + 1;
}
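
(On Linux, a shared object like that could be produced with something along the lines of cc -fPIC -shared func.c -o Foo.so, where func.c is a hypothetical file name for the C source above; Cryptol itself doesn't prescribe how Foo.so gets built.)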

The tentative plans for supporting more types, after discussions with @yav, are:

There is also the question of how ownership of memory is handled when using arrays. @yav came up with two choices:

I also think that there could be a third option:

Finally, we probably also want to support sequences of non-fixed length. The issue here is that in Cryptol the length is encoded as a type parameter, while in C it is usually a value parameter passed along with the pointer. So I was thinking we could just insert the type parameters as value parameters, so that something like foreign f : {n, m} [n] -> [m] would translate to uint64_t *f(size_t n, size_t m, uint64_t *x).

I created a github "project" at https://github.com/GaloisInc/cryptol/projects/1 where I'm tracking specific things I'm working on at the moment related to the FFI if anyone is curious.

yav commented 2 years ago

@qsctr @robdockins I was thinking a bit about what types should be supported by the FFI and their representation and came up with the following.

Representation

Cryptol                           C
----------------------            --------
finite numeric type               size_t

Bit                               bool
Float32                           float
Float64                           double
[K] Bit, 0  <= K <= 8             uint8_t
[K] Bit, 8  <  K <= 16            uint16_t
[K] Bit, 16 <  K <= 32            uint32_t
[K] Bit, 32 <  K <= 64            uint64_t

fin n, [n]Float32                 *float
fin n, [n]Float64                 *double
fin n, [n][K] Bit, 0  <= K <= 8   *uint8_t
fin n, [n][K] Bit, 8  <  K <= 16  *uint16_t
fin n, [n][K] Bit, 16 <  K <= 32  *uint32_t
fin n, [n][K] Bit, 32 <  K <= 64  *uint64_t

Arguments of tuple/struct types are passed into the function as multiple arguments, one for each field. Thus, the fields of structs/tuples need to fit into one of the above types.

We have to be careful to pass the fields of a struct in the correct order; I believe we do have this information available.

Results of tuple/struct types are returned via multiple output arguments.
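
For instance (purely illustrative; the Cryptol signature below is made up), a function taking and returning a pair would flatten like this:

/* Hypothetical example for a Cryptol function
     h : ([32], Bit) -> ([32], Bit)
   Each input field becomes its own parameter, and each result field becomes
   an output pointer (types from <stdint.h> and <stdbool.h>). */
void h(uint32_t in0, bool in1, uint32_t *out0, bool *out1);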

Ownership

I am still not quite decided on what policy to use for passing arrays. The two choices I was thinking of correspond to bullets 2 and 3 in @qsctr's description above:

  1. The receiver owns the arrays, meaning they have to deallocate them and the sender should not read from them. Pros: (1) allows us to implement in-place mutation in the C code for functions that take in an array and return an array of the same size; (2) arrays are manipulated uniformly, just like any other value.

  2. Cryptol manages all the memory, passing in buffers for result arrays. Pros: (1) makes the C code easier to write; (2) allows Cryptol to stack allocate everything (e.g., using alloca)

The choice of ownership policy has an effect on how C functions should return results:

  1. Arrays are returned as ordinary values. We use output parameters only for tuples/structs. An output parameter is always a pointer to the type used for the input parameter.

  2. Arrays are always returned using output parameters. Output parameters for arrays do not need an extra pointer (i.e., they have the same type as the input parameters).

Examples:

f : {m,n} (fin m, fin n) => [17] -> [m+n][8] -> [n][8]

g : {m} ([m][8], [64]) -> ([m][8], [64])

correspond to C functions of the following types under ownership schema (1):

uint8_t* f(size_t m, size_t n, uint32_t in1, uint8_t* in2)

void g(size_t m, uint8_t *in1, uint64_t in2, uint8_t **out1, uint64_t *out2)

and correspond to C functions of the following types under ownership schema (2):

void f(size_t m, size_t n, uint32_t in1, uint8_t* in2, uint8_t *out1)

void g(size_t m, uint8_t *in1, uint64_t in2, uint8_t *out1, uint64_t *out2)
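
To make the ownership difference concrete, here is a hypothetical body for f under schema (2); the computation is meaningless and only illustrates that the C side never allocates or frees anything, since Cryptol supplies both in2 (the m+n input bytes) and out1 (room for the n result bytes):

void f(size_t m, size_t n, uint32_t in1, uint8_t *in2, uint8_t *out1)
{
  /* Cryptol owns both buffers; the C code only reads in2 and writes out1. */
  for (size_t i = 0; i < n; i++)
    out1[i] = in2[m + i] ^ (uint8_t) in1;
}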

I am leaning slightly towards ownership schema 2, but could go either way. Thoughts?

qsctr commented 2 years ago

I am leaning towards 2 as well, since it feels simpler overall. The non-uniform behavior for arrays is unfortunate, but I guess it's just a consequence of how C passes arrays differently from other values.

For the representation of [K]Bit where K > 64, should it be uint64_t*? Also, what is the reasoning for flattening structs/tuples into arguments instead of passing a pointer to some memory holding the fields? Is it because of possible alignment issues? And are we supporting nested sequences/tuples/structs? I don't see any reason why we can't, since we could just flatten everything into one big piece of contiguous memory that is passed around.

weaversa commented 2 years ago

Another option is to provide an interface that programmers must target. Here's a pretty simple example for bitvector types (as well as some sequences): https://github.com/weaversa/bitvector. For Integers and the Z type, you could enforce the use of a math library such as GMP.
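
Sketching the general shape of that idea (this is not the API of the linked library, just an illustration of what "an interface that programmers must target" could look like for bitvectors):

/* Hypothetical target interface: the foreign code never sees raw machine
   words directly, only an abstract bitvector type managed through these
   functions (types from <stddef.h> and <stdint.h>). */
typedef struct bitvector bitvector;   /* opaque */

bitvector *bv_alloc(size_t width_in_bits);
void       bv_free(bitvector *bv);
uint64_t   bv_get_word(const bitvector *bv, size_t word_index);
void       bv_set_word(bitvector *bv, size_t word_index, uint64_t value);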

The same could be done for languages other than C. This is how the old Cryptol Java code generator worked: it provided its own interface that it would compile to, rather than going to native Java types. (For those with access, see: https://gitlab-ext.galois.com/cryptol/cryptol-codegen/-/blob/master/java-support/BitVector.java.) I can also imagine targeting hacspec for Rust.

Yet another alternative is to follow the semantics of SAW. In the Python interface (at least), Cryptol embedded in cry_f functions maps to LLVM data structures; bit-vectors, sequences, tuples, and records are all handled nicely.

yav commented 2 years ago

The types I listed are the types that map, more or less, 1-1 to C types.

There are various ways to support more types, but I think it might be better to handle those by writing a bit of Cryptol code to translate to/from the basic types, as this would make the required representation explicit.

For example, if I want to pass in [1024], I can write some Cryptol code to turn this into [128][8], which would force me to be explicit about whether I want this to be represented using little- or big-endian byte ordering. Similarly, if I want to pass in an array of pairs, I can write some code to flatten them, or it might be more convenient to pass a pair of arrays instead.

Once we have this working well, it should be relatively easy, I think, to build a library on top of it, or to extend it with extra types. I can see some potential challenges (e.g., generic packing of records, converting Integer to bytes via repeated division can be slow, etc.), but I am thinking that we could handle stuff like that by extending Cryptol with functionality that converts values to/from arrays of bytes, rather than mixing it in with the FFI. This should keep the code more manageable, and would give us extra functionality (e.g., it would enable us to serialize values to a file).

Edit: Thinking a bit more about this, Integer is a tricky case, because it does not have a fixed width representation, so we may have to deal with it directly in the FFI. If we decide to support it, I think we should add it after we've handled all the easy cases.

qsctr commented 2 years ago

@weaversa and @eddywestbrook mentioned that once we are done with the FFI it would be nice to reimplement something like the SuiteB AES primitives with FFI to try it out (as a separate module for now which can be a drop-in replacement for SuiteB). This would hopefully result in a speedup in code like this.

(The FFI is almost done, by the way; everything mentioned in @yav's earlier comment is already working except for size-polymorphic sequences.)

qsctr commented 2 years ago

Since #1376, containing an initial implementation of the FFI, has been merged, I'm going to close this issue for now. I've moved a few things mentioned in this discussion that weren't included in the initial PR into separate issues with the FFI tag; feel free to do the same for any further FFI-related features or discussions.