Structured trace messages

leodemoura commented 7 years ago

We currently use trace messages to diagnose problems. Examples: why two terms didn't unify, why the simplifier didn't do what we wanted, etc. We have different trace message classes, and we enable them using the command set_option trace.trace_class true. We (developers and advanced users) spend a lot time looking at traces. The traces can be quite long, and they are just sequences of strings.

We want to add structured trace messages where Lean object can be stored. Moreover, trace objects would be reflected in Lean, and we would be able to write lean code for browsing, filtering and creating trace objects. We can see the following benefits:

We can write functions to browse the trace. Example: show me the detailed is_def_eq trace whenever argument Y of lemma X fails to unify.
We can modify/filter the trace before displaying it.
We can add explicit sections. This is useful when visualizing the trace produced by recursive procedures (e.g., is_def_eq, simp, etc).
We can pretty print and export the trace messages in different formats.

The plan is to start with type_context and simplifier modules.

leodemoura commented 7 years ago

@jroesch This RFC should address the simplifier diagnosis problem that we have been discussing.

gebner commented 7 years ago

The traces can be quite long, and they are just sequences of strings.

A further issue is that they are one string. There is really no way to separate one is_def_eq trace from the next, or even just from a class resolution trace.

We want to add structured trace messages where Lean object can be stored.

I'm not entirely clear on the lifecycle of the trace objects.

Do they only exist while the simplifier is running? And at the end, we convert them to messages? That is, you get them as an extra result from the simplifier?
Are they persisted into the log_tree, just like other error messages?
Or do we introduce another mechanism to report them?

Does it make sense to have a unified data structure for both trace messages and other error messages? Or do you want the trace objects to be free of any kind of formatting information, and just contain for example a list of expression pairs for the def_eq trace?

leodemoura commented 7 years ago

@gebner I am not very familiar with the log_tree code. However, it would be nice if we could store the trace objects there.

And at the end, we convert them to messages?

Yes, we would convert them to messages. We can have a new message that stores the new structured trace/error object.

Does it make sense to have a unified data structure for both trace messages and other error messages?

I think so. We also want structured error messages. BTW, I'm assuming we will have trace/error constructor such as trace.mk_node : name -> list trace -> trace.

Or do you want the trace objects to be free of any kind of formatting information, and just contain for example a list of expression pairs for the def_eq trace?

We should be able to attach string and format objects in trace messages. We can view them as "comments". It doesn't hurt.

gebner commented 7 years ago

@gebner I am not very familiar with the log_tree code. However, it would be nice if we could store the trace objects there.

We can basically store any kind of object there as long as they are subclasses of log_entry_cell. These objects are distinguished using dynamic_cast. Adding is as easy as logtree().add(entry). I want to expose the log_tree structure to the VM anyhow at some point so that we can read error messages.

leodemoura commented 7 years ago

@gebner What about the following basic structured message for traces and errors?

inductive message 
| with_ctx : context -> message -> message
| tag : name -> message -> message
| node : list message -> message
| from_expr : expr -> message
| from_level : level -> message
| from_format : format -> message
| from_tactic_state : tactic_state -> message
| from_name : name -> message

Where context is the triple (environment, metavar_ctx and local_ctx). The meaning of with_context ctx msg is that all expressions and levels occurring in msg should be interpreted with respect to ctx. For example, the trace message

[type_context.is_def_eq] ?m_1 =?= f a + 2 + 2 ... success  (approximate mode)

would be encoded as

message.with_ctx ctx $
  message.tag `type_context.is_def_eq $
  message.tag `success $
    message.node [
        message.from_expr (?m_1),
        message.from_expr (f a + 2 + 2)
    ]

leodemoura commented 7 years ago

The trace sequence

[type_context.is_def_eq_detail] [1]: f 0 =?= f a
[type_context.is_def_eq_detail] [2]: 0 =?= a
[type_context.is_def_eq_detail] on failure: 0 =?= a
[type_context.is_def_eq_detail] on failure: f 0 =?= f a
[type_context.is_def_eq] f 0 =?= f a ... failed  (approximate mode)

would be encoded as

message.with_ctx ctx $
   message.tag `type_context.is_def_eq $
   message.tag `failed $
   message.tag `type_context.is_def_eq_detail $
         message.node [
             message.from_expr (f 0),
             message.from_expr (f a),
             message.tag `type_context.is_def_eq_detail $
                 message.node [
                    message.from_expr 0,
                    message.from_expr a,
                    message.from_name `on_failure,
                 ],
                 message.from_name `on_failure
         ]
     ]

The message reflects more precisely how the predicate is_def_eq was computed.

leodemoura commented 7 years ago

For nice error/trace messages, I think we need a richer set of constructors for message. Example: the format primitives such as nest and group. It feels like message is like a small document that contains formatting data and meta data.

digama0 commented 7 years ago

If this is to be used for general message formatting, I strongly recommend of_string : string -> message, since not all message sources are structured and you need a fallback when you still want to deliver a string message.

EDIT: never mind, I guess you can still do it by using the of_format constructor.

leodemoura commented 7 years ago

@digama0 We can implement of_string using message.from_format.

johoelzl commented 7 years ago

Would a call to the simplifier with tracing output one message? Or would it output a message per rewrite?

If one message, then how do we handle a non-terminating case. It helps me to see at least a prefix of the messages to figure out where the tactic is heading to.

If we there are multiple messages (e.g. one per rewrite), I think it might be worth to carry depth information, or have some enter, leave, backtracking tags to figure out where the tactic currently is.

Edit: Ah, I see the comment on sections in the original comment...

jroesch commented 7 years ago

@leodemoura I really like the idea of providing messages as "mini-documents', containing both human readable content as well as program fragments. It seems like a good way to experiment with simp (and other automation) error reporting. It would be cool to mark a trace analyzer as an "error handler" for a particular tactic, with the handler producing a message from the trace instead of producing the default error.

My example is declaring a trace analyzer for simp roughly of the form:

-- We can optionally render a message, if we can produce something useful.
def simp_analyze :  trace_set -> option message := ...

@[error_handler] def my_simp_analyzer := { 
tac_name := `simp, 
handler := simp_analyze
}

lemma foo : 1 = 1 := by simp -- ERROR here

It seems like one possibility is just adding this behavior directly to simp with an attribute, the ability to query the trace set, and render an error message. Just a thought, maybe you have something similar or better in mind.

gebner commented 7 years ago

If one message, then how do we handle a non-terminating case.

Just as we do now, via timeouts. Right now, we only report trace messages at the end of a proof either.

@[error_handler] def my_simp_analyzer := {

Not saying that this is a bad idea, but such a definition will affect every module that imports it.

leanprover / lean3

Structured trace messages #1692