Open atacratic opened 4 years ago
@atacratic A distributed API will need to be able to distinguish computations that can run on any Unison node (i.e., computations of type `'{IO} a`) from those that require additional capabilities that may only be available on a subset of the available nodes (e.g., `'{IO, GPU} a`), so you wouldn't want FFI-dependent programs to be handled into `IO` alone.
Preserving an explicit dependence on an FFI-backed ability would also neatly resolve the question you raised about how library authors expose FFI-dependent functions: library authors or end-users would just use the `Foo` ability's operations in their code and not worry about handling the ability explicitly. Instead, the runtime itself could implicitly handle available FFI operations (just as the current runtime implicitly handles `IO`). That is, whenever foreign code for handling `Foo` is linked into the runtime via a plugin mechanism, the runtime would evaluate any computation `'{IO, Foo} a` in a context in which the runtime itself provides `IO` and `Foo`. The runtime would also communicate with remote instances, letting them know which abilities it's capable of handling.
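To make the placement idea concrete, here is a minimal Python sketch (a toy model, not Unison and not any actual Unison API) of runtimes advertising the abilities they can handle, with a computation only being eligible for nodes that cover everything it requires. The `Node` class, the `eligible_nodes` function, and the node names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """Hypothetical remote runtime together with the abilities it advertises."""
    name: str
    abilities: frozenset


def eligible_nodes(required, nodes):
    """Return the nodes whose advertised abilities cover every required one."""
    needed = frozenset(required)
    return [node for node in nodes if needed <= node.abilities]


# Assumed cluster: every node handles IO, only one also handles GPU.
cluster = [
    Node("node-a", frozenset({"IO"})),
    Node("node-b", frozenset({"IO", "GPU"})),
]

io_only = [n.name for n in eligible_nodes({"IO"}, cluster)]
io_gpu = [n.name for n in eligible_nodes({"IO", "GPU"}, cluster)]
io_foo = [n.name for n in eligible_nodes({"IO", "Foo"}, cluster)]
```

In this model a `'{IO, Foo} a` computation simply has no eligible node until some runtime links in a `Foo` handler and advertises it, which is the information that would be lost if FFI-dependent programs were handled into `IO` alone.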
Because the FFI-dependent operations could crash, the plugin mechanism should insist that they require the `IO` ability (in addition to the implicitly required ability):

    unique ability Foo where
      foo : String ->{IO} Boolean

(Unique abilities would aid in versioning / dependency management: a new `Foo` could be added whenever "significant" changes to the foreign code necessitate it.)
This has the drawback that `Foo` operations become difficult to mock for testing, but plugin authors could mitigate that by providing a separate ability with parallel structure along with a handler into the FFI-dependent ability:

    unique ability FooFFI where
      foo : String ->{IO} Boolean

    ability Foo where
      foo : String -> Boolean

    liveFooHandler : Request {Foo} a ->{FooFFI} a

Nice writeup. I think the basic idea of "FFI represented as abilities, and there's some way of installing a new top-level handler and/or augmenting the `IO` handler to process the FFI ability by linking against some local library" is totally reasonable. This also has the bonus that anyone in the Unison ecosystem can write and share programs using that ability (since the ability itself is just pure Unison), so no funky stuff where your "build is broken because you're missing some random C library". It's only when you run the computation that you need a handler for the ability, and if the FFI abilities are just installed as new top level handlers, you can have it just produce a regular type error / ability check failure if you try to run something you don't have locally.
Seems like there are some details to sort out.
I imagine some UCM commands for adding a new top-level handler which `run` computations get to access.
It's interesting to think about three approximate classes of FFI binding.
Key question: do we allow people to write new FFI bindings in Unison?
If you have to write some Haskell to create a `ucm` plugin, that will massively reduce how much people actually feel inclined and able to do it. It will basically be seen as impractical by many people who would otherwise be doing item 3 above. I think that would be a significant barrier for Unison adoption. "Yeah, you can't call out to Python or Java without writing a compiler plugin in Haskell..."
(Also the plugin approach increases the amount of hassle you have, distributing plugin binaries to your various nodes - we want Unison to be helping reduce that kind of thing.)
For inspiration it's good to look at Idris 2's recently documented approach to FFI, here - click 'FFI Overview'. You decorate your Idris source with annotations that describe the bindings you want.
    %foreign "C:puts,libc"
    puts : String -> PrimIO Int
Having top-level handlers for FFI abilities does seem nice. It's good to be able to talk about your `'{IO, Foo} ()` program. And you're right @anovstrup that this improved expressiveness about a program's runtime requirements will work out better with distribution.
But I don't think that should preclude being able to express the bindings in a way that people think of as being in Unison itself.
You could imagine something like...
    ability Foo where
      %foreign "C:foo,libfoo"
      foo : String -> Boolean

    -- In `ucm` run `bind C Foo` to install a top-level handler for the `Foo`
    -- ability. Requires the system dynamic loader to be able to find libfoo.
(Or maybe the annotation is done directly using the `ucm` metadata commands, without an inline sugar - that would be fine.)
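As a rough illustration of what such an annotation carries, the Python sketch below (a toy model, not part of Unison, `ucm`, or Idris) splits a descriptor string into a language tag, a symbol name, and an optional library name. The `LANG:symbol,library` shape is assumed purely from the examples quoted in this thread; `ForeignBinding` and `parse_foreign` are hypothetical names.

```python
from typing import NamedTuple, Optional


class ForeignBinding(NamedTuple):
    """Parsed form of a descriptor such as "C:foo,libfoo"."""
    language: str
    symbol: str
    library: Optional[str]


def parse_foreign(descriptor):
    """Split "LANG:symbol[,library]" into its parts.

    The format is an assumption based on the examples in this thread.
    """
    language, sep, rest = descriptor.partition(":")
    language = language.strip()
    if not sep or not language:
        raise ValueError(f"missing language prefix in {descriptor!r}")
    symbol, _, library = rest.partition(",")
    symbol = symbol.strip()
    if not symbol:
        raise ValueError(f"missing symbol name in {descriptor!r}")
    # The library part is optional, so an empty string becomes None.
    return ForeignBinding(language, symbol, library.strip() or None)


binding = parse_foreign("C:foo,libfoo")
```

A handler-installing command would presumably start from this kind of record: the language tag picks the calling convention, and the library name is what the dynamic loader has to be able to find.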
I feel like having multiple top-level handlers is then a good reason to break up `IO` (into `File`, `Socket`, `Random`, `Timer`, `Clock`, `ConsoleIn`, `ConsoleOut`, `Threads` or whatever.) Otherwise it's a mental glitch for learners, "OK, so actually there are lots of abilities that let me do different kinds of IO, but only one of them is called `IO`..." The distinction would just come down to the implementation accident of what happened to be exposed natively via the Haskell runtime, which seems a bit ugly. Plus the extra expressiveness of the types would be a win. Ideally the status of the `Threads` (etc) abilities would be more or less the same as any other FFI-backed ability: it has a regular Unison ability declaration, which just happens to be decorated with some metadata saying things like `%foreign "builtin:fork"`. So they'd seem less magic than `IO` seems now.
@atacratic I don't know if I read your most recent comment carefully back when you wrote it, but I agree with your points about the drawbacks of a compiler-plugin approach. After reading the whole write-up in light of your latest comment, I came back around to your original idea of a built-in special form that would conjure up an ability handler, e.g., `base.io.bindForeign C Foo libfoo` defines a `Request Foo a ->{IO} a` function. (Or should it be `Request Foo a ->{FFI} a`?)
Regarding multiple top-level handlers, I think it's inevitable that that's coming (in fact, we already have `IO` and `Exception` in the Haskell runtime). I still kind of like the idea of breaking `IO` up like you suggest, but I can understand the resistance to that (the fact that any carving up of the ability requires some arbitrary decisions). If monolithic `IO` is preserved (with or without that name), I think it could be helpful to clarify what belongs in it and what doesn't. Is it the set of operations that all Unison runtimes must provide? Is it just the set of operations provided by the standard runtime, and other runtimes might provide other abilities (e.g., `Browser`) instead of `IO`? (What about operations that other runtimes can support even if they can't provide all of `IO`? Do they just claim to provide `IO` and then fail at runtime on unsupported operations (yuck!)?)
Here are some thoughts about FFI, since I got to thinking about it. Might be worth some discussion?
Summary
Unison should support some kind of Foreign Function Interface mechanism for interacting with non-Unison code. Doing FFI should be a kind of `IO`. Clearly it should fit in with abilities and handlers somehow. And we should think hard about whether to place some kind of restrictions on FFI, to improve the quality of the Unison library ecosystem that develops - maybe taking a leaf from Elm's book here.
Detail
1) We want Unison to be able to invoke code written in other languages
This is not entirely obvious: an alternative would be to say something like "Unison can interact with other systems as long as they expose REST APIs." It would certainly be simpler for Unison, to reduce the whole problem to one of sockets and JSON. That also gives us a well-understood story for testing, mocking and tracing.
But this would impose a huge Unison adoption penalty: sure, come use Unison, once you've wrapped all your pre-existing code up as a web service. Plus it gives the user a bunch of old-school distribution and orchestration complexity as well.
So we do want to be able to invoke code written in other languages directly, via some kind of FFI mechanism.
2) We only want FFI to happen during `IO` programs.
So, we don't want some arbitrary Unison function of type `String -> Boolean` to be able to call out to foreign code.
Now, it might make sense to allow this: let's say the programmer swears on their honor that their favourite C function `bool foo (char *)` reads no files and launches no missiles; that it simply takes its argument and computes on it to return a result. Then it's arguably convenient to make that accessible as a Unison `String -> Boolean`. But let's look at what could go wrong:
- The author of the binding could be mistaken, which would make false the claim, implied by the type `String -> Boolean`, that this function is pure.
- Linking against or calling `foo` could fail at runtime - a possibility again not visible in the `String -> Boolean` type.
Rather than leave Unison's consistency hostage to mistakes made by authors of FFI bindings, or to linker failures, let's instead say FFI is a kind of `IO`, and no FFI happens except when running `IO` programs.
3) We don't want to try and internalize the content of foreign functions into Unison term hashes
Suppose a Unison program `myProg : {IO} ()` uses FFI. Should the hash of `myProg` depend on the foreign code? In one sense, the answer might be yes: if the foreign code is different, then the program does different things, so surely we should treat it as a different program?
But at a pragmatic level, this clearly can't work: even if we could hash `libfoo.so`, or whatever foreign code artifact we call out to, we'd also need to hash all the transitive dependencies of that object too, even if those were themselves FFI bindings into other foreign languages...
This note expands on the problems with this approach.
So Unison hashes should not try and encode the whole content of the foreign code they may invoke.
This seems natural, if we think of the foreign code as an external resource with which we are performing IO.
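The difference between the two options can be sketched in a few lines of Python. This is a toy model only: it hashes strings with SHA-256 and is not Unison's actual hashing scheme, and the function names, the declaration text, and the two fake library builds are all made up for illustration.

```python
import hashlib


def _digest(*parts):
    """Hash a sequence of byte strings, length-prefixed so boundaries are unambiguous."""
    h = hashlib.sha256()
    for part in parts:
        h.update(len(part).to_bytes(8, "big"))
        h.update(part)
    return h.hexdigest()


def hash_by_reference(declaration, descriptor):
    """Hash only the Unison-side declaration and the name of the foreign symbol."""
    return _digest(declaration.encode("utf-8"), descriptor.encode("utf-8"))


def hash_by_content(declaration, library_bytes):
    """Hash the declaration together with the foreign artifact's contents."""
    return _digest(declaration.encode("utf-8"), library_bytes)


decl = "foo : String -> Boolean"
descriptor = "C:foo,libfoo"

# Two hypothetical builds of libfoo that differ only in their bytes.
libfoo_build_1 = b"fake libfoo build 1"
libfoo_build_2 = b"fake libfoo build 2"

reference_hash = hash_by_reference(decl, descriptor)
content_hash_1 = hash_by_content(decl, libfoo_build_1)
content_hash_2 = hash_by_content(decl, libfoo_build_2)
```

With hashing by reference, the hash stays the same whichever build of the library happens to be installed on a node. With hashing by content, every rebuild of the library (or of anything it links against) would produce a new hash for the same Unison code.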
Possible beginnings of a design
The above suggests a design starting off along the following lines.
Suppose you have a C function `bool foo (char *)`, and you want to write a Unison FFI binding to it. Then you write:
... and merrily use that from your code in the normal way.
The magic then comes in how you cook up a handler (into IO) for that ability.
Suppose you have a function `myStuff : '{Foo} ()`. Then you write something like
That call to `BindNative` cooked up a handler of type `Request Foo a ->{IO} a`.
What happens during typechecking of the call to `BindNative`? Is it working out that it needs to eliminate `{Foo}` (maybe a bit ambitious), reflecting over the ability declaration, checking it knows how to translate all the `Foo` operations into the C ABI...? Magic would be permissible here I think. Anyway, something would be possible, if you pass enough arguments to `BindNative`.
Issue: what does a library author do? Do they expose APIs like `Baz ->{Foo} Bar`? That seems preferable to exposing `Baz ->{IO} Bar`, since we want to move to `IO` only at the last possible moment - but then the user is required to know about "libfoo". Is some more magic needed to attach that as metadata to the `ability Foo`, accessible by `BindNative`?
Ecosystem considerations / comparison to Elm
Elm takes a very interesting line on FFI. (reference: Elm ports docs and in particular the design considerations section at the bottom)
In Elm's case, the FFI it's thinking about is interaction with other JS available in the webpage being rendered.
The rationale for the ban on FFI from library code is as follows.
Those are all pretty damn strong arguments, and I see them all applying to Unison. (OK, I'd like Unison to grow a 'never crashes' guarantee like Elm...)
However, this stricture has caused considerable gnashing of teeth in the Elm community, as it makes it harder to write libraries. (This is particularly frustrating for people in Elm's case for libraries needed to access platform facilities not yet otherwise exposed by Elm.) Not a decision to take lightly.
And it's not clear how an 'FFI is only for app-writers' rule could be mirrored in Unison. Maybe we encourage a taboo around exposing library APIs that mention `IO`. But that doesn't cut it, if all it achieves is to leave it to the application writer to be the one to finally bind some non-idiomatic abilities to some crashy, side-effecting foreign code.
I don't have any answers there, but it's definitely worth thinking about how our FFI story will play out in terms of the quality of the ecosystem and the development experience.
Tagging @pchiusano @aryairani @runarorama