LPCIC / elpi

Embeddable Lambda Prolog Interpreter
GNU Lesser General Public License v2.1
283 stars 35 forks source link

How to get started on embedding ELPI into an OCaml project? #178

Closed kiranandcode closed 1 year ago

kiranandcode commented 1 year ago

I'd like to try embedding ELPI into my OCaml projects, however I've been somewhat struggling to work out how to pass data from OCaml to Elpi and visa versa.

There isn't much documentation on the API, but by messing around with it, I was successfully able to call Elpi from my OCaml code:

let elpi = Elpi.API.Setup.init ~builtins:[] ()

let program = Elpi.API.Parse.program_from ~elpi
    ~loc:(Elpi.API.Ast.Loc.initial __FILE__) (Lexing.from_string ~with_positions:true "hello(let(\"x\", \"y\")).")

let query = Elpi.API.Parse.goal ~elpi
    ~loc:(Elpi.API.Ast.Loc.initial __FILE__) ~text:"hello(X)"

let compiled_query = Elpi.API.Compile.query
    (Elpi.API.Compile.program ~elpi [program])
    query

let executable =
  Elpi.API.Compile.optimize @@
  compiled_query

let () =
  Format.printf "program is %a\n%!" Elpi.API.Pp.program compiled_query;
  print_endline @@ match Elpi.API.Execute.once ?max_steps:None ?delay_outside_fragment:None executable
  with
  | Elpi.API.Execute.Success { assignments; constraints; state; output=(); pp_ctx } ->

    let term = Elpi.API.Data.StrMap.find "X" assignments in
    begin match Elpi.API.RawData.(look ~depth:1 term) with
      | App (constant, f, args) ->
        Format.printf "found App(%d,%a,[%a])\n%!" constant (Elpi.API.Pp.term pp_ctx) f
          (Format.pp_print_list (Elpi.API.Pp.term pp_ctx)) args;
      | _ -> ()
    end;
    Format.asprintf "assignments:\n%a\nconstraints:\n%a\nstate:\n%a\n%!"
      (Elpi.API.Data.StrMap.pp (Elpi.API.Pp.term pp_ctx)) assignments
      (Elpi.API.Pp.constraints pp_ctx) constraints
      Elpi.API.Pp.state state
  | Elpi.API.Execute.Failure -> "failure"
  | Elpi.API.Execute.NoMoreSteps -> "no more steps"

I have some data in OCaml that I'd like to expose to be available to query from Elpi. The problem is that I can't for the life of me figure out how to actually programmatically embed data into Elpi.

I've tried recreating my query above but constructing the AST directly in OCaml using the RawData module:

let let_ = Elpi.API.RawData.Constants.declare_global_symbol "let"
let hello = Elpi.API.RawData.Constants.declare_global_symbol "hello"
let term = let open Elpi.API.RawData in
  (mkAppL hello [mkAppL let_ [Elpi.API.RawOpaqueData.of_string "x"; Elpi.API.RawOpaqueData.of_string "y"]])

But if I convert this to a program using Elpi.API.Utils.clause_of_term and run the query, then it doesn't behave as my initial version did:

let program_2 =
  let open Elpi.API.RawData in
  Elpi.API.Utils.clause_of_term ~depth:1
    (Elpi.API.Ast.Loc.initial __FILE__)
    term

(* ... same as above *)
(* ===> failure *)

Is this the right way to go about doing what I want?

Elpi is a really cool project! Thanks for making it!

It's a shame that so much of its cool functionality is somewhat hard to work out how to unlock.

gares commented 1 year ago

Hello, first of all thanks for trying!

The very specific problem you are facing is that:

$ echo 'hello(let("x","y")).' | elpi -parse-term
Raw term: (Ast.Term.App ((Ast.Term.Const hello),
             [(Ast.Term.App ((Ast.Term.Const let),
                 [(Ast.Term.App ((Ast.Term.Const ,),
                     [(Ast.Term.CData x); (Ast.Term.CData y)]))
                   ]
                 ))
               ]
             ))

while you are generating hello (let "x" "y"). I suggest you write the type declarations for your data type, I guess you don't want let to take a proposition.

I think I will have to write a new tutorial, since https://github.com/gares/mlws18/tree/master/toyml is a bit outdated.

In general the API expects that you have OCaml datatypes and that you write for each of them a Conversion.t. If the datatype is opaque (no syntax), then OpaqueData contains a facility to declare it. Similarly, if you have an ADT without binders you can use AlgebraicData to get a conversion for it. If your data has binders, then you have to write the conversion by hand using RawData (maybe one day I'll finish the ppx, for now it is manual). There is also a ContectualConversion thing, but I won't elaborate on it right now.

Then, once you have the conversions for your data type, you typically define these data types in Elpi and provide some APIs using these. You passed [] as the ~builtins argument, there you should pass your data types declarations and APIs. See this more recent example: https://github.com/Deducteam/lambdapi/pull/418/files#diff-c64f40cc234cc6e799234c7e247b8a3b5bbb65bb126c5557e96ef6a8598f1660R57 The same conversions can be used to build a query using the facility in the Query module.

kiranandcode commented 1 year ago

Hi @gares! Thanks for the prompt response and answering my query! I really appreciate it, this really helps.

I didn't think of using elpi's -parse-term option to debug the parsed contents, but that makes sense.

I think I have enough to now go and hack with Elpi properly --- a few small final questions:

in my case, my data does contain binders, so I suppose I'll need to use RawData; could you point me to any code that does this that I could look at to get the hang of it?[1][2]

Secondly, could you elaborate a bit on how exactly to represent bound variables using the RawData api? Without documentation, it's a little bit trial and error to work out how I should encode a lambda term -- let's say (fun x -> x + 1). From looking at the implementation of RawData.mkBound here, I guess variables are using some kind of De Bruijn encoding?

[1] Looking at the lambda api code you mentioned, I guess this might be what you mean? [2] I was looking at coq-elpi, but it's somewhat hard to isolate the parts that are specific to interfacing with Coq versus the general patterns for embedding elpi.

gares commented 1 year ago

This is the embed/readback function for terms in lambdapi: https://github.com/Deducteam/lambdapi/pull/418/files#diff-2d4a49aabc7dd1490d6508d76d93d13727dbc814d5480e01c8f4b46beb82d63cR82 (note that Ctxt.unbind and Ctxt.type_of are "hacked", I store there a mapping from variable names to the current depth).

It would help to know how you represent binders in OCaml. In any case, in Elpi bound variables are represented with DeBruijn levels (see https://hal.inria.fr/hal-01176856/document) starting with 0, so

\lambda x. x + 1 (* latex *)
Lambda( "x", Plus(Var "x",Int 1)) (* ocaml? *)
lambda x\ plus x (int 1)  (* Elpi *)
App(lambda_, Lam (App(plus_ ,Const 0,[App(int_ ,CData 1,[])]), []) (* RawData *)

So, in the conversion from OCaml you have too keep a map from strings to the depth at which the variable was bound, here "x" -> 0. If binders are represented in OCaml as DeBruijn indexes, then levels are the dual (index i is level current_depth - i).

gares commented 1 year ago

If you project is public, I can comment on your current patch. It may be simpler for me to give you precise feedback.

kiranandcode commented 1 year ago

Ah, nice, yes, that really clears things up. Yes, I guess I must have forgotten about the use of De Brujin Levels for Elpi rather than indices, and that was tripping me up.

W.r.t a patch, thanks for the very kind offer, unfortunately most of the code I have is mostly summarised in the code above, so there's currently not much to look at --- as a personal project, I'm just currently experimenting with writing a meta-programming ppx for OCaml syntax that uses Elpi to allow the user to express transformations ~ a la the way it works in Coq currently. I think I have enough now to get started, will update if I run into anything else.

kiranandcode commented 1 year ago

I guess also, if you're interested, separately, in terms of research, we are currently exploring a using Elpi + Coq to extend some work that we will be presenting at this coming PLDI[1]. Most of our experiments in this direction are currently in the initial stage and are actually written purely using coq-elpi's Coq plugin directly and we don't use Elpi's OCaml API directly, but our tool already calls into Coq from OCaml, so eventually we plan to probably interact directly with Elpi as well.

So likely we'll have some more questions about elpi in the future.

[1] Code should also be publicly available soon.

gares commented 1 year ago

meta-programming ppx for OCaml syntax that uses Elpi

Then we really need to talk, since I did try that a while ago (and never finished).

https://github.com/LPCIC/elpi/tree/ocaml-elpi

I had a ppx to automatically synthesize Conversion.t for an OCaml ADT (with support for binders) and applied it to the OCaml AST. Then you can write a ppx working on OCaml files in Elpi. Since the AST is large, the former ppx is a prerequisite for the latter. Maybe I have some engineering resources to revive the former, but not the latter.

kiranandcode commented 1 year ago

Ah! Very nice, this is very cool.

In which case, to avoid duplicating effort, I'd be happy to hack on that instead. Let me try cloning the repo and checking out the code (it seems from the commit you pointed me to, the diff has enough information for me to work out which files to look at) and get back to you.

gares commented 1 year ago

Before you lose time, it is a very old branch. I think it makes sense I bring it back to life (rebase)

gares commented 1 year ago

But anyway, feel free to look at it. I'd be happy if something comes out of that branch

kiranandcode commented 1 year ago

Ah, missed this message earlier. Yep, I should have rebased, but I did notice that the branch was very old, and instead decided to hack on the master branch.

As an initial experiment, I just copied over the ppx_elpi and ocaml_elpi folders to the master branch and have been hacking to get things working.

I've now fixed most of the dependency related low-hanging changes (the OCaml AST had changed so had to update the ocaml_ast_for_elpi file) and the ppx now runs, however the code it produces seems to be making use of a class Elpi.API.Conversion.ctx that no longer exists or has been moved (I'm getting errors Unbound class type Elpi.API.Conversion.ctx). Does this class still exist? or do I need to make some more structural changes to the elpi ppx?

Current branch is: https://github.com/Gopiandcode/elpi/tree/reocaml-elpi but, I'll make a new branch rebasing the old changes onto master.

gares commented 1 year ago

Does this class still exist? or do I need to make some more structural changes to the elpi ppx?

That is the mess I wanted you to avoid ;-)

The current API has two conversions, Conversion.t and ContextualConversion.t. The ppx was targetting a different API, which was grown in the same branch, my bad, where the two unified.

To summarize, the branch in question includes:

I think I have a branch where I try to just have the first change, but I did not manage to merge it. I don't recall exactly if the problem was just lack of time, or that it was not OK (Coq-Elpi is a heavy user of the API, I'm pretty sure I wanted to validate the new API on it before making a decision).

These are the things I wanted to explain. There are more things which are too hackish in that branch. If you have energy to put in there, let's talk quickly ASAP so that I can explain all I recall and we can devise a plan.

Otherwise I suggest you don't try to rebase, play with the idea in that old version of Elpi and OCaml, and if it looks promising then discuss how to revive the thing.

kiranandcode commented 1 year ago

No worries, it only took a few minutes to get that branch compiling, so nothing much was lost :)

These are the things I wanted to explain. There are more things which are too hackish in that branch. If you have energy to put in there, let's talk quickly ASAP so that I can explain all I recall and we can devise a plan.

Yes, I'm happy to hack on this some more.

To summarize, the branch in question includes:

  • large API change
  • ppx targetting that API (which is more "homogeneous")
  • ocaml-ppx uses the ppx for

I had a sneaking suspicion that this might be the case after seeing some of the compiler errors.

Okay, so do you think that the large API change is required to implement the PPX? or can the PPX be implemented with the current Conversion.t API?

As I'm less familiar with the design goals and direction of Elpi itself, I'm assuming that it would be more straightforward for me to re-implement your earlier ppx for the new API rather than making large changes to the API.

Of course, if this is not possible, then I am happy to try and change the API, although it would be helpful if you could give a few pointers on what the limitations are of the current API and what needs to be added to support automatically deriving instances.

gares commented 1 year ago

I think the simplest way forward is to make the ppx target the current API.

The drawback is that, in case of ADTs with binders, say a and b, you will generate ca : (ctx_a,a) ContextualConversion.t and cb : (ctx_b,a) ContextualConversion.t and you will be unable to use these two types together in a builtin, eg

  MLCode(Pred("mix_and_match",
    CIn(ca,   "X",
    CIn(cb,   "Y",
    Full(ctx_a_and_b, "takes X and Y and does stuff in the current context")),
  (fun x y ctx ~depth ->... )),
  DocAbove);

The new API was allowing that, using objects to make a "union" of the context for a and for b. The current API imposes all arguments (with contextual types) to have the same context.

To give an example, in Coq-Elpi I have a ContextualConversion.t for Coq terms, so that you can for example call Coq's type checker or unifier deep inside a term. See https://lpcic.github.io/coq-elpi/tutorial_coq_elpi_HOAS.html#the-context

So, after all, the big API change was "cool" but not necessary in my concrete case since I only have one contextual datatype in Coq-Elpi. But well, I did an academic sin ;-) and aim at the general case in that branch.

You could also target Conversion.t, but that would only be an intermediate step, since the whole point of using Elpi would be to have real support for binders.

There is another mess in that branch. When you apply the ppx to a large ast like OCaml's one you immediately realize that you also want a copy/map API for the terms. And I had the bad idea to quickly hack this in, generating "text" rather than proper AST. The map.* set of predicates you see here: https://github.com/LPCIC/elpi/blob/ocaml-elpi/ocaml-elpi/tests/test_swap.elpi , idea being you only want to write the code which maps the AST node you want to elaborate, and not write by hand the identity on the rest.

kiranandcode commented 1 year ago

Okay, perfect. I think I have enough now to go on; I'll write a ppx targetting creating a ContextualConversion.t embedding for each datatype in the OCaml AST --- my first prototype will probably just treat all binders the same and use the same context for all embeddings, and then I'll expand to try different contexts for different kinds of binders.

Thank you for patiently explaining all the details!

I'll report back here once I've made some progress.

gares commented 1 year ago

Great, I hope you can salvage something from my old code.

About generating the Elpi code to map a term, the API lack support for the AST of, eg, types déclarations. I can surely help adding these.