Open keigoi opened 1 month ago
This looks like impressive work, but it's also not so clear what it does right now from the code. Some early comments:
When you decide to un-draft this PR, it would help if you could show a simple example of type and generated printing function in the PR description. This helps build intuition when reading the code.
From a distance I wonder if the logic could be simplified a bit if you allowed yourself to cut some corners, for example:
let rec foo : 'a 'b 'c . ....
.type ('a, _) t = ...
, use GADT mode for the second parameter. This is different from what we do today, and I wonder if there are scary compatibility issues to think about, for now it's just a thought.Hello @gasche, thank you for your prompt comments! I filled up the summary now, and any further comments/edits will be highly appreciated. Thanks!
This PR introduces
[@@deriving show]
for a subset of GADTs.Code might look a bit ugly for now, and there are some rooms to be improved. So, please discuss, suggest, and/or comment on this PR. Thanks!
Summary
Before thinking about the implementation, let me re-cap the "calling convention" for derivers on certain types (which I believe). For polymorphic types like
'a foo
, Ppx_deriving generates functions like:See that here is an extra parameter
poly_a : formatter -> 'a -> unit
for pretty-printing its type parameter'a
, and let's call such a parameter (abovepoly_a
) a sub-instance.On calling pretty printers for such (non-GADT) types, the caller will supply sub-instances for the argument types. For example, the deriver will generate
(pp_foo pp_int)
for printingint foo
wherepp_int : formatter -> int -> unit
.Sub-instances are often unused for GADTs
The observation is that, such sub-instances like
poly_a
above are usually never invoked for certain GADTs. Instead, type parameters are eventually refined to a ground type and pretty-printed based on that (concrete) type without the need for supplying the sub-instances.This PR will exploit the fact. Here, we would have a few options for that:
'b
):The current PR follows this strategy. By relying on this signature generation policy, a larger class of GADTs will (hopefully) be handled without introducing compatibility issues (see below).
Here are what this PR extends Ppx_deriving (show), but I suppose the same strategy would be effective for other derivers as well:
1. Generating "degenerate" pretty-printers (instances)
One of obstacles for handling GADTs in the current Ppx_deriving (show) is that it cannot generate valid code for type variables in the refined types and for existentials. For example, the
Pair
below has type variables'x
and'y
for which we have no way to generate pretty-printers (instances), and the deriver will just put nonexistent functionspoly_x
andpoly_y
which leads to a compile-time error.Instead of that, based on the above "eventually grounded" assumption, this PR will generate a never-called, degenerate pretty-printer for both
'x
and'y
:which raises an exception, but is actually never called as it is guarded by the type
[this_type_is_refined]
polymorphic variant type which prevents accidental call. The case forPair
in the printerpp_baz
will be:This typechecks thanks to the "more liberal" signature above for GADTs.
2. New option
refined_params
for declaring the "refining" type parameterAs I described above, we need some way to specify which parameters are "refining" at first. For this purpose, the current code provides the option
refined_params=[ <list of parameter positions> ]
which can give the positions for refined type parameters to the deriver. For example,will produce
See that the first (0-th) argument of
pp_bar
is parametric over type'c
which is unrelated to the rest of signature.(As @gasche mentioned in his comment, it would be more preferable if we could just put
_
in the type declaration to specify this. However, It is not compatible with the current behaviour which just eliminates the sub-instance argument for the corresponding type parameter e.g._ foo
will lead topp_foo : formatter -> 'a foo -> unit
without argumentpoly_a
.)3. Semi-automatic inference for refined parameters
In the implementations (.ml), refined parameters are usually apparent from the return type of constructor types. For this, I implemented a quick code to decide which types are actually defined. However, I admit that this convention to use type variable names to specify refined parameters is rather convoluted...
See: https://github.com/keigoi/ppx_deriving/blob/d2ccd5fc983e3fb5f19fa4a2408ae923de8c06a8/src_plugins/show/ppx_deriving_show.ml#L247-L283
4. Wrapping functions with locally abstract type signatures (newtypes)
Apparently we need this to handle pattern-maching on GADTs.
See: https://github.com/keigoi/ppx_deriving/blob/e18c11e4e56349d52dcd0acb5e203d6f555ba262/src_plugins/show/ppx_deriving_show.ml#L394-L399
(I agree that this is better to be done unconditionally (and to replace the current signature), as @gasche had mentioned.)
Related: https://github.com/ocaml-ppx/ppx_deriving/issues/7