ocaml-ppx / ppx_import

Less redundancy in type declarations and signatures
MIT License
89 stars 29 forks source link

Fix syntax #62

Open pitag-ha opened 3 years ago

pitag-ha commented 3 years ago

With the syntax type t = [%import: Foo.bla], the [%import: Foo.bla] is written at a place where the OCaml parser expects a core type extension node. But ppx_import doesn't only want to inject a core type (ptype_manifest) into the corresponding type declaration. It also wants to inject the concrete type definition (ptype_kind), which breaks with the idea of an extension node being local.

To express ppx_import in an OCaml AST conform syntax, it would have to be written as a extension node at the level of a structure item. Derivers such as sexp or yojson would just be packed into the payload. That would look as follows:

[%%import : type t = Foo.bla [@@deriving sexp,yojson]]

or, expressed in a nicer way:

type%import t = Foo.bla [@@deriving sexp,yojson]

That syntax would be conform with the OCaml AST and, as a nice "side-effect", it would also be convenient from a source-code point of view.

Of course, releasing that syntax change would break all users. But it doesn't break the semantics or features, it only breaks the syntax and so one could use a simple sed command on a project in order to upgrade to the new ppx_import. You could also prepare for the breaking change by making a 1.x release allowing both syntaxes, but deprecating the old one.

What do you think about the proposed syntax in general? And what do you think about introducing it in a major release?

(cc @NathanReb since we've been talking about this)

pitag-ha commented 3 years ago

@gasche has pointed out here (in 3.) what I meant with "ppx_import doesn't conform to the OCaml AST":

When you say that ppx_import does not conform to the OCaml AST, you mean that the interpretation of the extension point depends on its placement in a wider context, and transforms other parts of the program than just the AST node it is placed at.

The end of the sentence captures exactly what I want to say: the expansion of the node spreads onto the rest of the tree. From what I understand, extension nodes are meant to be expanded (as nodes), not to expand the rest of the tree.

gasche commented 3 years ago

The proposed syntax is okay, and with the type%import version it is actually nicer than the current syntax. However I don't understand how [@@deriving] directives would be handled with this version. Currently, if I understand correctly, ppx_import is not aware of any ulterior transformation that occur after it imports the definition; it works with [@@deriving] or other extensions without explicit coordination. Does the proposal you have in mind require explicitly handling these attributes in some way? Or is the idea just to collect those attributes and re-apply them to the expanded structure/signature items? But if it is the latter, why not use [%%import type t = foo] [@@deriving bar]?

gasche commented 3 years ago

From what I understand, extension nodes are meant to be expanded (as nodes), not to expand the rest of the tree.

Outside ppxlib, there are no strict requirements on how extension nodes should be interpreted by processors. There are also several good reasons why the interpretation of an extension node may sometimes want to leak outside its initial AST placement. Consider for example an extension that expands to an expensive computation that does not depend on variables in its lexical scope, and only needs to be done once per program execution. You want

let foo x = [%expensive computation]

to generate something like

let __expensive_computation = lazy ....
let foo x = Lazy.force __expensive_computation

One way to work around this issue is to require that such extension nodes are themselves embedded into larger extensions that span a larger scope than the possible rewriting. So one would have to write

begin%with_expensive_computations_inside
let foo x = [%expensive computation]
end;;

and have the interpretation of the outer extension do the job of lifting the fragments to the toplevel. However this approach:

Remark: this situation I describe is not the situation of ppx_import, for which we may reasonably give a local semantics by just increasing the scope slightly. However it does suggest that supporting non-local expansions is useful/important in the general case.

pitag-ha commented 3 years ago

Ok, I see your point. Thanks for the example! So I take back my general statement about extension nodes only meant to be local in general.

Maybe we can agree on the following? For performance and for predictability, extension nodes ideally don't leak outside their original AST placement unless it's in their nature to do so. It is not in the nature of ppx_import to leak outside its original AST placement: it currently does so due to a choice made about its syntax that restricts its scope one step too much. Does that sound right to you?

pitag-ha commented 3 years ago

Does the proposal you have in mind require explicitly handling these attributes in some way? Or is the idea just to collect those attributes and re-apply them to the expanded structure/signature items?

The latter.

But if it is the latter, why not use [%%import type t = foo] [@@deriving bar]?

That's because if you write [%%import type t = foo] [@@deriving bar], the parser expects bar to be an attribute attached to an extension point, but bar is likely written as a type declaration attribute. You can see in this test that [%%import type t = foo] [@@deriving bar] would fail with "Error: Attributes not allowed here" (I was surprised myself yesterday when I wrote the test).

gasche commented 3 years ago

For performance and for predictability, extension nodes ideally don't leak outside their original AST placement unless it's in their nature to do so. It is not in the nature of ppx_import to leak outside its original AST placement: it currently does so due to a choice made about its syntax that restricts its scope one step too much. Does that sound right to you?

Yes, that sounds perfectly right to me.

(Note that, to someone who is not aware of the current AST structure, it is not clear that the extension handling is non-local: for all they can see, type t = [%import Foo.t] gets turned into something like type t = Foo.t = A | B, which at a source level seems to correspond to filling what's after the =.)

I think that the approach that you propose is nicer than the current one. (Especially now that I'm un-confused about the way attributes would be handled; thanks!). But it is not clear to me that the change is worth breaking all user code. Few things are, and I don't know that this particular case has benefits that outweigh the costs, except of course for the rather directly obvious benefit "it pleases ppxlib people, who maintain the whole ppx infrastructure in practice".

pitag-ha commented 3 years ago

except of course for the rather directly obvious benefit "it pleases ppxlib people, who maintain the whole ppx infrastructure in practice".

Another important benefit would be the improvement in performance!

has benefits that outweigh the costs

I agree that usually the cost of a breaking change is high and therefore there must be really good reasons to accept that cost. But take into account that in this particular case the cost of the breaking change is quite low since upgrading to the new version would really just be as simple as running a command along the lines

sed 's/type\(.*\)\[%import:\([^]]*\)\]/type%import\1\2/g'

one time.

ejgallego commented 3 years ago

IMVHO, the new syntax has several advantages, and given that there are only 10 rev deps in opam for ppx_import I think it'd be reasonable to deprecate it and remove it in a 2.0 version.

Personally, I am a fan of this kind of "little" improvements which make maintenance easier. On their own may not seem much, but in the wider context they do really add up.