ocaml-ppx / ppxlib

Base library and tools for ppx rewriters
MIT License

Ppxlib’s behaviour in case of raised exception. #448

Closed panglesd closed 9 months ago

panglesd commented 1 year ago

There have been several "private" discussions about ppxlib’s behaviour when a transformation throws an exception. Those conversations led to decisions, but the reasons behind the decisions were not always easy to find. This issue is an attempt at solving this!

Past and current behaviour

When I started working on ppxlib, a transformation raising an exception caused the whole AST to be removed and replaced by a single item: an error extension node containing the exception.

This was not a problem for the compiler, which reports the first error it encounters and stops. However, it was a problem for Merlin, which had no way to "recover from the error". None of its features could work on such an empty AST (see also https://github.com/ocaml-ppx/ppxlib/issues/314).
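As a rough illustration of that old behaviour, here is a minimal sketch in plain OCaml. It is a toy model, not ppxlib's code: the `item` type, `Error_node` (standing in for an error extension node such as `[%%ocaml.error "..."]`), `run_old` and `failing` are all hypothetical names invented for this example.

```ocaml
(* Toy model, not ppxlib's real types: an AST is just a list of items. *)
type item =
  | Code of string        (* an ordinary structure item *)
  | Error_node of string  (* stands in for an error extension node *)

(* A transformation maps an AST to an AST and may raise. *)
type transformation = item list -> item list

(* Old behaviour: if any transformation raises, the whole AST is thrown
   away and replaced by a single error node carrying the exception. *)
let run_old (transformations : transformation list) (ast : item list) =
  try List.fold_left (fun ast t -> t ast) ast transformations
  with exn -> [ Error_node (Printexc.to_string exn) ]

(* Hypothetical failing rewriter. *)
let failing : transformation = fun _ -> failwith "ext: cannot expand"

let result = run_old [ failing ] [ Code "module A"; Code "module B" ]
```

After the failure, `result` contains nothing but the error node: both modules are gone, which is why Merlin had nothing left to work with.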

So, we needed a way to recover from the error, to give a reasonable AST to Merlin. After some discussions, we implemented in https://github.com/ocaml-ppx/ppxlib/pull/315 the following behaviour:

This behaviour was discussed, also seeking advice from Merlin people. I think the reasoning behind this was to give PPX authors the possibility to distinguish between "hard" errors, where continuing with the unrewritten AST would not make sense, and "recoverable" errors, where it is possible to continue. Recoverable errors would be embedded manually in the AST by the rewriter.
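The distinction between the two kinds of errors can be sketched with a toy rewriter in plain OCaml. This is only an illustration under assumed names: `item`, `Error_node`, `Hard_error` and `expand_item` are hypothetical and do not come from ppxlib.

```ocaml
(* Toy model, not ppxlib's real types. *)
type item =
  | Code of string
  | Error_node of string  (* stands in for an error extension node *)

(* Hypothetical exception for errors the rewriter cannot recover from. *)
exception Hard_error of string

(* Hypothetical expander for a single item. *)
let expand_item = function
  | Code "unsupported" ->
      (* Recoverable: embed the error in place and keep going, so the
         rest of the AST is still rewritten. *)
      Error_node "ext: unsupported payload"
  | Code "corrupt" ->
      (* Hard: raise, leaving it to the driver to decide what to do. *)
      raise (Hard_error "ext: cannot continue")
  | Code s -> Code (s ^ " (expanded)")
  | Error_node _ as e -> e

let expand ast = List.map expand_item ast

let recovered = expand [ Code "ok"; Code "unsupported"; Code "also ok" ]
```

In `recovered`, only the offending item becomes an error node; its neighbours are expanded as usual.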

However, there are problems with this approach:

For instance in:

module A = struct
  type t = [%ext failing] [@@deriving show]
end

module B = struct
  type t = A | B [@@deriving show]
  let f = print A
end

there would be the "Unbound value print" error, which is hard to link with the failure to expand [%ext failing].

New behaviour

After we started patching some rewriters to have them embed recoverable errors, doubts about the current approach were raised. We discussed it, and think the following is better:

Continuing when a transformation has failed can lead to unrelated errors if some transformations depend on previous ones, but unrelated errors were also present in the previous approach. With the new approach, the unrelated errors are at least all directly linked to the fact that a rewriter did not run. For instance, in:

module A = struct
  type t = [%ext failing] [@@deriving show]
end

module B = struct
  type t = A | B [@@deriving show]
  let f = print A
end

there would be the "Deriving show: Cannot derive extension points" error, which is not ideal but understandable if [%ext failing] failed to expand.

Note that if we embed errors in the correct order, nothing will change from the point of view of the compiler (except perhaps worse performance, see below).
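A minimal sketch of "continue after a failure and embed errors in order", again as a toy model in plain OCaml rather than ppxlib's implementation. The names `item`, `Error_node` and `run_new` are hypothetical, and placing the error nodes at the front of the AST is an assumption made for this example.

```ocaml
(* Toy model, not ppxlib's real types. *)
type item =
  | Code of string
  | Error_node of string  (* stands in for an error extension node *)

(* Run every transformation. When one raises, keep the AST it was given,
   remember the error, and move on to the next transformation. *)
let run_new transformations ast =
  let step (ast, errors_rev) t =
    try (t ast, errors_rev)
    with
    | Failure msg -> (ast, Error_node msg :: errors_rev)
    | exn -> (ast, Error_node (Printexc.to_string exn) :: errors_rev)
  in
  let ast, errors_rev = List.fold_left step (ast, []) transformations in
  (* Errors come first, in the order they were raised (assumed placement),
     so a tool that stops at the first error sees the earliest failure. *)
  List.rev_append errors_rev ast

(* Hypothetical rewriters: two that fail and one that succeeds. *)
let fail_first _ = failwith "first"
let fail_second _ = failwith "second"
let upper =
  List.map (function Code s -> Code (String.uppercase_ascii s) | e -> e)

let result = run_new [ fail_first; upper; fail_second ] [ Code "a"; Code "b" ]
```

Here `result` starts with the two error nodes, "first" then "second", followed by the items rewritten by `upper`, which still ran even though the rewriter before it failed.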

So:

Performance

Not failing early is important so that Merlin gets as much of the rewritten AST as possible to work with. For the compiler, however, it would be better to fail early, since only the first error is used.

I don’t know how much of a problem this is... One possibility would be to fail early in the case of the compiler, and fail late in the case of Merlin. In fact, that might already be the case: when the -embed-errors option is not set, no exception is caught, and the rewriter fails early. But I don’t know if this option is commonly set by build systems! Edit: It seems that dune does not use that flag when running PPXs.
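The fail-early versus fail-late switch can be sketched the same way. This is a toy model in plain OCaml: the `~embed_errors` argument only mimics the role of the -embed-errors option mentioned above, and `item`, `Error_node`, `run` and `boom` are hypothetical names, not ppxlib's API.

```ocaml
(* Toy model, not ppxlib's real types. *)
type item =
  | Code of string
  | Error_node of string  (* stands in for an error extension node *)

let run ~embed_errors transformations ast =
  let step (ast, errs) t =
    if embed_errors then
      (* Fail late: catch the failure, record it, keep going. *)
      try (t ast, errs) with Failure msg -> (ast, Error_node msg :: errs)
    else
      (* Fail early: nothing is caught, the exception escapes. *)
      (t ast, errs)
  in
  let ast, errs = List.fold_left step (ast, []) transformations in
  List.rev_append errs ast

(* Hypothetical failing rewriter. *)
let boom _ = failwith "ext failed"

let late = run ~embed_errors:true [ boom ] [ Code "x" ]

let early =
  try Ok (run ~embed_errors:false [ boom ] [ Code "x" ])
  with Failure msg -> Error msg
```

With the flag on, `late` is an AST that still contains `Code "x"` behind the error node; with it off, the first failure stops everything and `early` only holds the message.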

Progress

@Burnleydev1 has already made some progress on implementing the new behaviour:

NathanReb commented 9 months ago

I think this can now be closed!

panglesd commented 9 months ago

Sure!