Refactoring elab exception handling

Kha commented 1 week ago

The Lean.Exception type used during elaboration comes in two flavors: errors and internal exceptions

/-- Exception type used in most Lean monads -/
inductive Exception where
  /-- Error messages that are displayed to users. `ref` is used to provide position information. -/
  | error (ref : Syntax) (msg : MessageData)
  /--
  Internal exceptions that are not meant to be seen by users.
  Examples: "postpone elaboration", "stuck at universe constraint", etc.
  -/
  | internal (id : InternalExceptionId) (extra : KVMap := {})

At the time of writing, the internal exceptions are

builtin_initialize postponeExceptionId : InternalExceptionId ← registerInternalExceptionId `postpone
builtin_initialize unsupportedSyntaxExceptionId : InternalExceptionId ← registerInternalExceptionId `unsupportedSyntax
builtin_initialize abortCommandExceptionId : InternalExceptionId ← registerInternalExceptionId `abortCommandElab
builtin_initialize abortTermExceptionId : InternalExceptionId ← registerInternalExceptionId `abortTermElab
builtin_initialize abortTacticExceptionId : InternalExceptionId ← registerInternalExceptionId `abortTactic
builtin_initialize autoBoundImplicitExceptionId : InternalExceptionId ← registerInternalExceptionId `autoBoundImplicit
builtin_initialize isDefEqStuckExceptionId : InternalExceptionId ← registerInternalExceptionId `isDefEqStuck
builtin_initialize checkAssignmentExceptionId : InternalExceptionId ← registerInternalExceptionId `checkAssignment
builtin_initialize outOfScopeExceptionId : InternalExceptionId ← registerInternalExceptionId `outOfScope
builtin_initialize congrHypothesisExceptionId : InternalExceptionId ← registerInternalExceptionId `congrHypothesisFailed
builtin_initialize backtrackExceptionId : InternalExceptionId ← registerInternalExceptionId `backtrackFormatter
builtin_initialize delabFailureId : InternalExceptionId ← registerInternalExceptionId `delabFailure
builtin_initialize analyzeFailureId : InternalExceptionId ← registerInternalExceptionId `analyzeFailure

As documented, internal exceptions are mainly about visibility. Note however that this usage is not enforced by the API: as try catch etc catch any Exception, it is very easy to accidentally catch internal exceptions as well and e.g. add them to the log (where they will be printed as "internal exception: ...").

While errors can/should generally only be handled generically such as by putting them into the log, internal exceptions give an easy and efficient way to be handled case by case via the exception ID. An interesting question is whether they in turn should never be handled in a generic way; some names like abort* certainly imply that they should never be caught except by the specific code handling them, but for others we should make sure there are no cases where e.g. generic backtracking logic is supposed to catch them (very quick review: Term.observing and Elab.App make sure not to backtrack on generic internal exceptions but Elab.Match does not). If this is indeed the expectation, it would be good to ensure this in the API by leaving internal exception catching to catchInternalId(s).

To complicate matters, there is one more kind of exception used in the code: "runtime exceptions", which currently consist of heartbeat and max rec depth exceptions (see Exception.isRuntime). They are encoded as errors but a comment mentions this is mostly because of module dependencies. On the one hand, they are handled individually in some cases so having an internal exception ID would make sense; on the other hand, they are clearly user-visible. So it might be the case that this is better handled by giving errors an ID as well which we will want at some point for purposes of an error index.

One special handling of runtime exceptions is that they in fact are not caught by CoreM try-catch by default (but are in CommandElabM, unclear to me whether that inconsistency should exist). This behavior can be turned off with a flag but it turns out that both uses of this flag follow the exact same pattern:

     withCatchingRuntimeEx
       try
         withoutCatchingRuntimeEx do
           tryBranch
       catch ex =>
         if ex.isRuntime then
           runtimeBranch
         else
           throw ex

So as a first refactoring, the flag could be replaced with a new combinator, changing the code to

tryCatchRuntimeEx
  tryBranch
  fun ex => runtimeBranch

For context, this all came up because in #3940, I fixed elab interruption (checkInterrupted) to never be caught inside the elaborator, which raised the questions above for the existing exceptions.

In summary, possible refactoring steps to be done here:

[x] replace with(out)CatchingRuntimeEx with new tryCatchRuntimeEx combinator
[ ] (unclear) make try-catch ignore internal exceptions (and not just interrupt and runtime exceptions)
[ ] (unclear) make runtime exceptions internal exceptions

Vtec234 commented 1 week ago

that they in fact are not caught by CoreM try-catch by default (but are in CommandElabM, unclear to me whether that inconsistency should exist)

I have spent a good amount of time debugging an issue caused by this once. In that situation, I expected the CoreM try-catch to catch all exceptions and was surprised to find out it did not. The other way to ensure consistency would be to also skip runtime exceptions in CommandElabM try-catch, but I would personally prefer that they were always caught everywhere, and there was a separate catchEverythingButRuntimeAndInternal construct.

Kha commented 1 week ago

I think the main reason for not catching some exceptions by default is that it is just too easy to unintentionally misuse try-catch otherwise. If there is no reasonable default, it may be better just not to implement MonadExcept and provide only specifically named combinators. Out of interest, would it have helped you if the try-catch types would have been more clear about its behavior, e.g. if we had separate MonadExcepts for internal exceptions and errors?

Vtec234 commented 4 days ago

Out of interest, would it have helped you if the try-catch types would have been more clear about its behavior, e.g. if we had separate MonadExcepts for internal exceptions and errors?

My mistaken mental model was that try-catch catches everything. I suppose anything that makes it clear that that's not the case would have prevented the confusion. Out of the various possible clarifications, ones that encode that in the type system rather than just writing a docstring are probably preferrable since not many people will read the docstring for a well-understood construct such as exception handling.

leanprover / lean4

Refactoring elab exception handling #4079