Ppxlib’s behaviour in case of raised exception.

There has been several "private" discussions about ppxlib’s behaviour in case a transformation throws an exception. Those conversations lead to decisions, but the reasons behind the decisions were not always easy to find. This issue is an attempt at solving this!

Past and current behaviour

When I started working on ppxlib, when a transformation raised an exception, the whole AST was removed and replaced by a single item, an error extension node containing the exception.

This was not a problem for the compiler, which reports the first error it encounters and stops. However, it was a problem for Merlin, which had no way to "recover from the error". None of its features could work on such an empty AST (see also https://github.com/ocaml-ppx/ppxlib/issues/314).

So, we needed a way to recover from the error, to give a reasonable AST to Merlin. After some discussions, we implemented in https://github.com/ocaml-ppx/ppxlib/pull/315 the following behaviour:

When a transformation raises, the last valid AST is used (that is, the one given to the transformation), with an error extension node appended at its beginning.
For context-free transformations, a raise would cancel the context-free phase: the last valid AST would be the one when starting the context-free phase.

This behaviour was discussed, also seeking advices from Merlin people. I think the reasonning behind this was to be able to give PPX authors the possibility to distinguish between a "hard" error, where continuing with the unrewritten AST would not make sense, and "recoverable" errors where it is possible to continue. Recoverable errors would be embedded manually in the AST by the rewriter.

However, there are problems with this approach:

PPXs were written with exception raising even in the case of recoverable errors, and it is not scalable to change them all
We could not find a single example of "non-recoverable" PPX error, where it is better to use the last valid AST
Not running the upcoming transformations leads to many unrelated errors (unbound values, unrewritten extension node...). Those unrelated errors were difficult to understand: they could be in completely different part of the file, and have no links with the actual error.

For instance in:

module A = struct
  type t = [%ext failing] [@@deriving show]
end

module B = struct
  type t = A | B [@@deriving show]
  let f = print A
end

there would be the "Unbound value print" error, which is hard to link with the failure of expanding [%ext A].

New behaviour

After starting to patch some rewriters to have them embed recoverable errors, doubts with the current approach were raised. We discussed the approach, and think the following is better:

When a transformation raises, the last valid AST is used, as input to the upcoming transformations. All errors are collected. At the end, the errors are appended, as extension nodes, at its beginning.
For context-free transformations, a raise would be caught and used in replacement of the generated code.

Continuing when a transformation has failed can lead to unrelated errors if some transformations depends on previous ones, but unrelated errors were also present in the previous approach. But the unrelated errors are at least all directly linked to the fact that a rewriter did not run. For instance:

module A = struct
  type t = [%ext failing] [@@deriving show]
end

module B = struct
  type t = A | B [@@deriving show]
  let f = print A
end

there would be the "Deriving show: Cannot derive extension points"", which is not ideal but understandable if [%ext failing] failed to derive.

Note that if we embed errors in the correct order, from the point of view of the compiler, nothing will change (except maybe a worse performance, see after.)

So:

There is no example of a case where we want panic from a rewriter to make ppxlib panic itself, so this modification frees all PPX authors (and us) from having to modify their PPX to embed errors
In both cases, there are unrelated errors, but in the new behaviour at least they are directly linked to the source of the problem.
The behaviour wrt to the compiler does not change

Performance

Not failing early is important to have as much rewritten AST as possible, for Merlin to be able to work. However, for the compiler, it would be better to fail early, since only the first error is used.

I don’t know how much of a problem it is... Maybe, one possibility would be to fail early in the case of the compiler, and fail late in the case of Merlin. In fact, that might already be the case: when the -embed-errors option is not set, no exception is caught, and the rewriter fails early. But I don’t know if this option is commonly set by build systems! Edit: It seems that dune does not use that flag when running PPXs.

Progress

@Burnleydev1 has already made some progress on implementing the new behaviour:

[x] When a transformation raises, the last valid AST is used, as input to the upcoming transformations. All errors are collected. At the end, the errors are appended, as extension nodes, at its beginning. https://github.com/ocaml-ppx/ppxlib/pull/447
[x] For context-free transformations, a raise would be caught and used in replacement of the generated code.

ocaml-ppx / ppxlib