stedolan / ppx_stage

Staged metaprogramming in stock OCaml
MIT License

When does ppx_stage code evaluation happen? #11

Closed by rizo 6 years ago

rizo commented 6 years ago

I wasn't really sure where to ask this simple question; sorry if I'm misunderstanding the purpose of this ppx.

My question is: when does ppx_stage code evaluation happen? Am I correct to assume that the generated binary is a code generator that produces the expanded version of the code? And if so does this mean that the produced source code still needs to be compiled by the standard OCaml compiler?

This is the conclusion I drew from a basic example, which I include for completeness.

Original input:


[%%code
  let square x = x * x
]

let rec spower n x =
  if n = 0 then [%code 1]
  else if n mod 2 = 0 then [%code square [%e spower (n/2) x]]
  else [%code [%e x] * [%e spower (n-1) x]]

let spower7 = [%code fun x -> [%e spower 7 [%code x]]]

let () =
  Format.printf "@[%a@]@." Ppx_stage.print spower7

With ppx_stage, this code compiles to a generator that, when run, prints the following source code:

fun x -> x * (square (x * (square (x * 1))))

This output is a string that can be stored in a file and then compiled to produce the desired final program.
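To make that second step concrete, here is a minimal sketch of wrapping the printed string into a compilable module. It uses only the standard library; the names `residual`, `prelude`, `module_source`, `write_file` and the file name `spower7.ml` are made up for illustration, and the assumption that the residual file needs its own definition of `square` is mine, not something ppx_stage is confirmed to arrange.

```ocaml
(* The generator's output, copied from the example above. *)
let residual = "fun x -> x * (square (x * (square (x * 1))))"

(* Assumption: the residual code refers to [square], so the file we
   write must define it before use. *)
let prelude = "let square x = x * x"

(* Source text of a standalone module exposing the specialised function. *)
let module_source =
  Printf.sprintf "%s\nlet spower7 = %s\n" prelude residual

(* Write [contents] to [path], closing the channel even on error. *)
let write_file path contents =
  let oc = open_out path in
  Fun.protect
    ~finally:(fun () -> close_out oc)
    (fun () -> output_string oc contents)

(* Usage (not run here): write_file "spower7.ml" module_source *)
```

The resulting file could then be handed to `ocamlopt` like any other source file.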

Again for completeness, here's the generator code that ppx_stage produces:

module Staged_656268120 =
  struct
    let square x = x * x
    let staged0 hole''_1 hole''_2 =
      let contents''_1 = hole''_1 in
      let contents''_2 = hole''_2 in
      {
        Ppx_stage.compute =
          (fun env'' ->
             (contents''_1.Ppx_stage.compute env'') *
               (contents''_2.Ppx_stage.compute env''));
        Ppx_stage.source =
          (fun ren'' ->
             Ppx_stage.Internal.substitute_holes
               (Marshal.from_string
                  "\132\149\166\190\000\000\000\192\000\000\000+\000\000\000\159\000\000\000\158\176\165\176\144\160\144!*\176\192&pow.mlI\001\000\144\001\000\165\192\004\002I\001\000\144\001\000\166@\176\192\004\004I\001\000\144\001\000\165\192\004\005I\001\000\144\001\000\166@@\160\160@\176\144\160\144\",1\176\192\004\014I\001\000\144\001\000\160\192\004\015I\001\000\144\001\000\161@\176\192\004\017I\001\000\144\001\000\158\192\004\018I\001\000\144\001\000\164@@\160\160@\176\144\160\144\",2\176\192\004\027I\001\000\144\001\000\169\192\004\028I\001\000\144\001\000\170@\176\192\004\030I\001\000\144\001\000\167\192\004\031I\001\000\144\001\000\186@@@\176\192\004!I\001\000\144\001\000\158\192\004\"I\001\000\144\001\000\186@@"
                  0)
               (function
                | Ppx_stage.Internal.SubstHole 1 ->
                    contents''_1.Ppx_stage.source ren''
                | Ppx_stage.Internal.SubstHole 2 ->
                    contents''_2.Ppx_stage.source ren''
                | _ -> assert false))
      }
    and staged1 hole''_1 =
      let contents''_1 = hole''_1 in
      {
        Ppx_stage.compute =
          (fun env'' -> square (contents''_1.Ppx_stage.compute env''));
        Ppx_stage.source =
          (fun ren'' ->
             Ppx_stage.Internal.substitute_holes
               (Marshal.from_string
                  "\132\149\166\190\000\000\000|\000\000\000\030\000\000\000q\000\000\000o\176\165\176\144\160\144&square\176\192&pow.mlH\000R\000t\192\004\002H\000R\000z@\176\192\004\004H\000R\000t\192\004\005H\000R\000z@@\160\160@\176\144\160\144\",1\176\192\004\014H\000R\000}\192\004\015H\000R\000~@\176\192\004\017H\000R\000{\192\004\018H\000R\001\000\142@@@\176\192\004\020H\000R\000t\192\004\021H\000R\001\000\142@@"
                  0)
               (function
                | Ppx_stage.Internal.SubstHole 1 ->
                    contents''_1.Ppx_stage.source ren''
                | _ -> assert false))
      }
    and staged2 =
      {
        Ppx_stage.compute = (fun env'' -> 1);
        Ppx_stage.source =
          (fun ren'' ->
             Ppx_stage.Internal.substitute_holes
               (Marshal.from_string
                  "\132\149\166\190\000\000\000\028\000\000\000\b\000\000\000\028\000\000\000\027\176\145\160!1@\176\192&pow.mlGx\000O\192\004\002Gx\000P@@"
                  0) (function | _ -> assert false))
      }
    and staged3 hole''_1 =
      let x''b1 = Ppx_stage.Internal.fresh_variable "x" in
      let contents''_1 =
        hole''_1
          {
            Ppx_stage.compute =
              (fun env -> Ppx_stage.Internal.compute_variable x''b1 env);
            Ppx_stage.source =
              (fun ren -> Ppx_stage.Internal.source_variable x''b1 ren)
          } in
      {
        Ppx_stage.compute =
          (fun env'' ->
             fun x ->
               contents''_1.Ppx_stage.compute
                 (Ppx_stage.Internal.Environ.add env'' x''b1 x));
        Ppx_stage.source =
          (fun ren'' ->
             Ppx_stage.Internal.substitute_holes
               (Marshal.from_string
                  "\132\149\166\190\000\000\000\134\000\000\000\027\000\000\000j\000\000\000i\176\196@@\176\144\160!x\176\192&pow.mlL\001\000\190\001\000\215\192\004\002L\001\000\190\001\000\216@\176\192\004\004L\001\000\190\001\000\215\192\004\005L\001\000\190\001\000\216@@\176\144\160\144\",1\176\192\004\012L\001\000\190\001\000\222\192\004\rL\001\000\190\001\000\223@\176\192\004\015L\001\000\190\001\000\220\192\004\016L\001\000\190\001\000\243@@\176\192\004\018L\001\000\190\001\000\211\192\004\019L\001\000\190\001\000\243@@"
                  0)
               (function
                | Ppx_stage.Internal.SubstHole 1 ->
                    (Ppx_stage.Internal.Renaming.with_renaming x''b1
                       contents''_1.Ppx_stage.source) ren''
                | _ -> assert false))
      }
    and staged4 x''v0 =
      {
        Ppx_stage.compute = (fun env'' -> x''v0.Ppx_stage.compute env'');
        Ppx_stage.source =
          (fun ren'' ->
             Ppx_stage.Internal.substitute_holes
               (Marshal.from_string
                  "\132\149\166\190\000\000\0009\000\000\000\012\000\000\000,\000\000\000+\176\144\160\144\";0\176\192&pow.mlL\001\000\190\001\000\240\192\004\002L\001\000\190\001\000\241@\176\192\004\004L\001\000\190\001\000\240\192\004\005L\001\000\190\001\000\241@@"
                  0)
               (function
                | Ppx_stage.Internal.SubstContext 0 ->
                    x''v0.Ppx_stage.source ren''
                | _ -> assert false))
      }
  end
let rec spower n x =
  if n = 0
  then Staged_656268120.staged2
  else
    if (n mod 2) = 0
    then Staged_656268120.staged1 (spower (n / 2) x)
    else Staged_656268120.staged0 x (spower (n - 1) x)
let spower7 =
  Staged_656268120.staged3 (fun x -> spower 7 (Staged_656268120.staged4 x))
let () = Format.printf "@[%a@]@." Ppx_stage.print spower7

Which is clearly not a "residual" program.
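The dump is easier to read once you notice that every staged fragment becomes a record with two fields: `compute`, which runs the fragment directly, and `source`, which rebuilds its text. The sketch below is a deliberately simplified stand-in, not ppx_stage's real type: the actual records thread an environment and a renaming and store a marshalled parse tree, whereas here the "environment" is just the value of `x` and the source is a plain string. The helper names `lit`, `var_x`, `mul` and `square_code` are hypothetical.

```ocaml
(* Simplified model of a staged value: one way to run it, one way to print it. *)
type code = { compute : int -> int; source : string }

let lit n = { compute = (fun _ -> n); source = string_of_int n }

(* The single variable of this example; its value is the whole environment. *)
let var_x = { compute = (fun x -> x); source = "x" }

(* Plays the role of staged0: multiply two holes. *)
let mul a b =
  { compute = (fun env -> a.compute env * b.compute env);
    source = Printf.sprintf "(%s * %s)" a.source b.source }

(* Plays the role of staged1: apply [square] to one hole. *)
let square_code a =
  { compute = (fun env -> let v = a.compute env in v * v);
    source = Printf.sprintf "(square %s)" a.source }

let rec spower n x =
  if n = 0 then lit 1
  else if n mod 2 = 0 then square_code (spower (n / 2) x)
  else mul x (spower (n - 1) x)

let spower7 = spower 7 var_x
```

Here `spower7.compute 2` evaluates the staged code in-process, while `spower7.source` yields text that would still need to go through the compiler.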

Are my assumptions correct? What would a complete compilation pipeline look like for an actual application? Is it possible to evaluate the generated code at compile time (using the bytecode interpreter, for example)?

Thanks in advance!

stedolan commented 6 years ago

> My question is: when does ppx_stage code evaluation happen? Am I correct to assume that the generated binary is a code generator that produces the expanded version of the code? And if so does this mean that the produced source code still needs to be compiled by the standard OCaml compiler?

Yep, that's it. I'd like to automate this step of calling out to the OCaml compiler, but haven't gotten around to it yet.

rizo commented 6 years ago

Thank you for clarifying that. Do you have any ideas about how that could be done? Also, could you recommend any literature on staging or macros in general?

I'm going to close this issue, thanks again.

Drup commented 6 years ago

@rizo Well, that's what MetaOCaml is for, isn't it? :) You can read this webpage.

stedolan commented 6 years ago

Technically, calling out to ocamlopt and passing it some source code (or a binary parse tree) isn't particularly difficult. What's more annoying is getting the library search paths and dependencies right, and it does depend on the runtime environment being the same as the build environment. Most of the annoyances here are build-system trouble.

@Drup's link is a good resource, and I'd also recommend Stream Fusion, to Completeness: even if you're not particularly interested in optimising streaming, its first few sections are a good intro to staging.

Drup commented 6 years ago

@stedolan You sort of gloss over one point which is non-trivial: linking back against the generated code and having the generated program access the in-memory objects of the host program without serialization.

stedolan commented 6 years ago

> You sort of gloss over one point which is non-trivial: linking back against the generated code and having the generated program access the in-memory objects of the host program without serialization.

ppx_stage doesn't do implicit CSP (cross-stage persistence), so I don't think there's any particular difficulty here? I agree that implicit CSP is a tricky feature to implement!

Drup commented 6 years ago

It's not really a matter of implicitness, and more a matter of the semantics you choose. You could have something that is explicit but doesn't serialize.

Also, it goes both ways: you also need to retrieve the results. :p

stedolan commented 6 years ago

I'm afraid I still don't understand what the difficulty is. I'm imagining passing the generated source code to ocamlopt, building a .cmxs, and using Dynlink to load the resulting .cmxs. Dynlink will automatically link any references to global modules, and that's the only sort of reference that ppx_stage allows from staged to host code. Once linked, calling a staged function is just like calling any other, and involves no serialisation, either to pass the arguments or to retrieve the result.
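A rough sketch of that pipeline, under several assumptions: `ocamlopt` is on the `PATH`, the generated file needs no extra `-I` or package flags (which is exactly the build-system trouble mentioned earlier), and the loader is passed in as a parameter so that this snippet itself does not need to be linked against the dynlink library. In a real program `loadfile` would be `Dynlink.loadfile`. The names `cmxs_args`, `compile_and_load`, `spower7_impl` and `register` are hypothetical, and the registration slot is just one common way for a plugin to hand a function back to its host.

```ocaml
(* ocamlopt arguments for building a plugin (.cmxs) from generated source. *)
let cmxs_args ~src ~out = [ "-shared"; "-o"; out; src ]

(* Compile [src] to [out], then hand the plugin to [loadfile].
   Assumption: a real caller passes [Dynlink.loadfile] here. *)
let compile_and_load ~(loadfile : string -> unit) ~src ~out =
  let cmd = Filename.quote_command "ocamlopt" (cmxs_args ~src ~out) in
  match Sys.command cmd with
  | 0 ->
      loadfile out;
      Ok ()
  | code -> Error (Printf.sprintf "ocamlopt exited with code %d" code)

(* Hypothetical host-side slot that the generated module fills in at load
   time by calling [register]; no serialisation is involved. *)
let spower7_impl : (int -> int) option ref = ref None
let register f = spower7_impl := Some f
```

With this shape, the generated file would end with a call such as `Host.register spower7` (where `Host` is whatever global module holds the slot), and the host reads `!spower7_impl` after loading.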

MetaOCaml-style CSP does make this trickier, but lacking that I don't see the difficulty.

Drup commented 6 years ago

@stedolan Ah, sorry, I was really just making a general comment on your answer "execution of staged code is easy" [not sic :p].

In the case of ppx_stage, you already have a specific semantics in mind that avoids the question, although it's arguably a bit boring. :)

rizo commented 6 years ago

(Sorry for the delayed reply!)

@Drup:

> Well, that's what MetaOCaml is for, isn't it? :) You can read this webpage.

Right. I was under the impression that MetaOCaml follows one particular approach that is tightly integrated with the type system for macro type-checking (I don't understand all the details, unfortunately). My main problem is understanding the relation between MetaOCaml, Modular Macros and the approach implemented in ppx_stage (which seems to be very straightforward).

Although I am very interested in staged programming, I come from a C++/D background, where staging is not strongly typed but still very useful. I'm trying to learn about staging and super-compilation from an implementor's perspective, but navigating this domain without guidance has been an obscure journey so far! :)

@stedolan: Thanks for clarifying how you want to achieve code execution; I think it makes a lot of sense.

@Drup: Also, could you explain why you consider explicit CSP boring? :)