idris-lang / Idris2

A purely functional programming language with first class types
https://idris-lang.org/
Other
2.46k stars 369 forks source link

Add %foreign_impl pragma for augmenting ffi functions #3303

Closed dunhamsteve closed 3 weeks ago

dunhamsteve commented 3 weeks ago

Description

Currently in Idris, we can't add a new backend to an existing %foreign definition without editing the library containing the definition. This issue came up for me when I wanted to use Control.App in node and needed to stub out prim__fork for the node codegen. But it also arises when working with a third party backend.

In a discord discussion @stefan-hoeck suggested the following syntax to add additional bindings to an existing %foreign definition:

%foreign_impl Prelude.IO.prim__putStr "javascript:lambda:x=>..."

This implementation of that syntax follows the pattern of %foreign and allows an indented block of expressions that evaluate to strings. The test case uses the pragma to make Control.App work with the node codegen.

The general idea is to update the definition of the %foreign function to add the new bindings. We store the additional information in Options, accumulate it as TTC files are read in, and then update the defs in Compiler.Common. We can't directly modify the definition when the pragma is evaluated because the context is dropped after compiling the module.

Should this change go in the CHANGELOG?

stefan-hoeck commented 3 weeks ago

Very nice! Thanks for taking the time to implement this.

dunhamsteve commented 3 weeks ago

Currently, it just aggregates the directives. The most recently loaded module is first. I was on the fence on whether this should be an error as I can see a case for overriding the declaration in the library and possibly a case for a backend wanting more than one declaration for a given name. So I took the simpler path and the handling of duplicates is left to the backend.

From auditing the code (I haven't experimented to verify):

There is a little bit of ambiguity here, if two TTC are loaded that have an override for the same backend, the behavior depends on the load order. I do check that the name we override/augment is a %foreign, ignoring visibility, and give an error if it is not a foreign function.

Aside from what is already implemented, if we want to detect these conflicts, we could either give a warning or an error when there is a duplicate. We could check:

  1. When processing the pragma
    • This will miss the cases where two conflicting TTC are loaded independently
  2. When loading the TTC file and merging it's foreignImpl into the context
    • This approach gets us the name of the module with the conflicting definition
  3. When merging the foreign_impl into the context in Compiler.Common.addForeignImpl
  4. When a backend processes the list of bindings
    • The backends currently choose the first match, but they could be changed
    • This allows a backend to decide what to accept
    • We don't have FC on the individual bindings
mattpolzin commented 3 weeks ago

I think this capability may be too useful not to merge even with some shortcomings (though non-determinism is particularly unfortunate). Long term, I wonder if foreign functions would be more ergonomic if an error for ambiguity could be given and a hide directive could be used to address that error.

dunhamsteve commented 3 weeks ago

For now, I have added documentation for how duplicate declarations are handled.

Note that it is also possible to introduce ambiguity with the existing compiler by doing:

%foreign "javascript:foo"
         "javascript:bar"
something : Nat -> Nat