Replace logging infrastructure

BinderDavid commented 4 months ago

Output currently looks like this:

>  RUST_LOG=trace pol check examples/example.pol
[TRACE]  |- head =>
[TRACE] [[]] |- Stream ▷ Stream
[TRACE] ↓Stream ~> Stream
[TRACE] [[],[Stream]] |- Nat => Type
[TRACE]  |- add =>
[TRACE] [[]] |- Nat <= Type
[TRACE] Type =? Type
[TRACE] [[]] |- Nat ▷ Nat
[TRACE] ↓Nat ~> Nat
[TRACE] ↓Nat ~> Nat
[TRACE] [[y@0.0]] |- Nat ▷ Nat
[TRACE] ↓Nat ~> Nat
[TRACE] [[Nat],[Nat]] |- Nat => Type
[TRACE] [[y@1.0],[@0.0]] |- Nat ▷ Nat
[TRACE] ↓Nat ~> Nat
[TRACE] [[Nat]] |- Z => y <= Nat
[TRACE] [[y@1.0],[]] |- Nat ▷ Nat
[TRACE] ↓Nat ~> Nat
[TRACE] [[y@1.0],[]] |- Nat ▷ Nat
[TRACE] ↓Nat ~> Nat
[TRACE] [[Nat],[]] |- y <= Nat
[TRACE] Nat =? Nat
[TRACE] [[Nat]] |- S(x') => S(x'.add(y)) <= Nat
[TRACE] [[Nat],[]] |- Nat <= Type
[TRACE] Type =? Type
[TRACE] [[y@1.0],[]] |- Nat ▷ Nat
[TRACE] ↓Nat ~> Nat
[TRACE] [[y@1.0],[x'@0.0]] |- Nat ▷ Nat
[TRACE] ↓Nat ~> Nat
[TRACE] [[y@1.0],[x'@0.0]] |- Nat ▷ Nat
[TRACE] ↓Nat ~> Nat
[TRACE] [[y@1.0],[x'@0.0]] |- Nat ▷ Nat
[TRACE] ↓Nat ~> Nat
[TRACE] [[Nat],[Nat]] |- S(x'.add(y)) <= Nat
[TRACE] [[y@1.0],[x'@0.0]] |- Nat ▷ Nat
[TRACE] ↓Nat ~> Nat
[TRACE] [[Nat],[Nat]] |- x'.add(y) <= Nat
[TRACE] [[y@1.0],[x'@0.0]] |- Nat ▷ Nat
[TRACE] ↓Nat ~> Nat
[TRACE] [[Nat],[Nat]] |- y <= Nat
[TRACE] Nat =? Nat
[TRACE] [[y@1.0],[x'@0.0]] |- Nat ▷ Nat
[TRACE] ↓Nat ~> Nat
[TRACE] [[Nat],[Nat]] |- x' <= Nat
[TRACE] Nat =? Nat
[TRACE] [[y@1.0],[x'@0.0]] |- Nat ▷ Nat
[TRACE] ↓Nat ~> Nat
[TRACE] Nat =? Nat
[TRACE] [[y@1.0],[x'@0.0]] |- Nat ▷ Nat
[TRACE] ↓Nat ~> Nat
[TRACE] Nat =? Nat
[TRACE]  |- ap =>
[TRACE] [[]] |- Nat <= Type
[TRACE] Type =? Type
[TRACE] [[]] |- Nat ▷ Nat
[TRACE] ↓Nat ~> Nat
[TRACE] [[x@0.0]] |- NatToNat ▷ NatToNat
[TRACE] ↓NatToNat ~> NatToNat
[TRACE] [[Nat],[NatToNat]] |- Nat => Type
[TRACE]  |- Z =>
[TRACE]  |- Nil =>
[TRACE] [[]] |- Nat ▷ Nat
[TRACE] ↓Nat ~> Nat
[TRACE] [[]] |- Z <= Nat
[TRACE] [[]] |- Nat ▷ Nat
[TRACE] ↓Nat ~> Nat
[TRACE] Nat =? Nat
[TRACE]  |- Nat =>
[TRACE]  |- S =>
[TRACE] [[]] |- Nat <= Type
[TRACE] Type =? Type
[TRACE] [[]] |- Nat ▷ Nat
[TRACE] ↓Nat ~> Nat
[TRACE]  |- Cons =>
[TRACE] [[]] |- Nat <= Type
[TRACE] Type =? Type
[TRACE] [[]] |- Nat ▷ Nat
[TRACE] ↓Nat ~> Nat
[TRACE] [[Nat]] |- Nat <= Type
[TRACE] Type =? Type
[TRACE] [[n@0.0]] |- Nat ▷ Nat
[TRACE] ↓Nat ~> Nat
[TRACE] [[Nat,Nat]] |- Vec(n) <= Type
[TRACE] [[n@0.1,x@0.0]] |- Nat ▷ Nat
[TRACE] ↓Nat ~> Nat
[TRACE] [[Nat,Nat]] |- n <= Nat
[TRACE] Nat =? Nat
[TRACE] Type =? Type
[TRACE] [[n@0.1,x@0.0]] |- n ▷ n@0.1
[TRACE] [[n@0.1,x@0.0]] |- Vec(n) ▷ Vec(n@0.1)
[TRACE] ↓n@0.1 ~> n
[TRACE] ↓Vec(n@0.1) ~> Vec(n)
[TRACE] [[n@0.2,x@0.1,xs@0.0]] |- Nat ▷ Nat
[TRACE] ↓Nat ~> Nat
[TRACE] [[Nat,Nat,Vec(n)]] |- S(n) <= Nat
[TRACE] [[n@0.2,x@0.1,xs@0.0]] |- Nat ▷ Nat
[TRACE] ↓Nat ~> Nat
[TRACE] [[Nat,Nat,Vec(n)]] |- n <= Nat
[TRACE] Nat =? Nat
[TRACE] [[n@0.2,x@0.1,xs@0.0]] |- Nat ▷ Nat
[TRACE] ↓Nat ~> Nat
[TRACE] Nat =? Nat
[TRACE]  |- Vec =>
[TRACE] [[]] |- Nat <= Type
[TRACE] Type =? Type
[TRACE] [[]] |- Nat ▷ Nat
[TRACE] ↓Nat ~> Nat
[TRACE]  |- tail =>
[TRACE] [[]] |- Stream ▷ Stream
[TRACE] ↓Stream ~> Stream
[TRACE] [[],[Stream]] |- Stream => Type
[TRACE]  |- Stream =>
[TRACE]  |- NatToNat =>

This is configured via:

env_logger::builder().format_timestamp(None).format_target(false).init();

If we omit the format_target(false), then it also prints the module where the trace macro was called:

[TRACE elaborator::typechecker::decls]  |- head =>
[TRACE elaborator::normalizer::eval] [[]] |- Stream ▷ Stream
[TRACE elaborator::normalizer::val] ↓Stream ~> Stream
[TRACE elaborator::typechecker::typecheck] [[],[Stream]] |- Nat => Type
[TRACE elaborator::typechecker::decls]  |- add =>
[TRACE elaborator::typechecker::typecheck] [[]] |- Nat <= Type
[TRACE elaborator::typechecker::util] Type =? Type
[TRACE elaborator::normalizer::eval] [[]] |- Nat ▷ Nat
[TRACE elaborator::normalizer::val] ↓Nat ~> Nat
[TRACE elaborator::normalizer::val] ↓Nat ~> Nat
[TRACE elaborator::normalizer::eval] [[y@0.0]] |- Nat ▷ Nat
[TRACE elaborator::normalizer::val] ↓Nat ~> Nat
[TRACE elaborator::typechecker::typecheck] [[Nat],[Nat]] |- Nat => Type
[TRACE elaborator::normalizer::eval] [[y@1.0],[@0.0]] |- Nat ▷ Nat
[TRACE elaborator::normalizer::val] ↓Nat ~> Nat
[TRACE elaborator::typechecker::typecheck] [[Nat]] |- Z => y <= Nat
[TRACE elaborator::normalizer::eval] [[y@1.0],[]] |- Nat ▷ Nat

We can also omit the tracing level, i.e. the leading TRACE.

BinderDavid commented 4 months ago

We get a lot of stuff already built in. For example, we can filter out only the relevant trace output from modules we are interested in: https://docs.rs/env_logger/latest/env_logger/ We just have to use different arguments to RUST_LOG. We can use this to add more fine-grained traces and filter out those which are not interesting.

timsueberkrueb commented 4 months ago

Let's discuss the requirements of the tracing feature. My main requirements are the following:

Inspect the type checking steps that have been taken when checking a program (incl. normalization, unification)
Display the De-Bruijn indices for all variables
The output should be human-readable (e.g. colorized, indented)
The feature should work even if the application panics
Minimize cluttering the domain logic with tracing code

My main worries about the approach in this PR are 3 and 5. These were the reasons I went with a custom trace macro in the first place. In particular, we lost colorized output and gained clutter.

For instance, with the custom tracing infrastructure, we could have indented and colorized output like the following (SLTC for simplicity):

[ ] Γ |- f x => ?
 |  Γ |- f => A -> B
 |  Γ |- x  => A
 |------------------- ( -> )-Elim
 |  Γ |- f x  => B
[x]

where the trace macro would keep track of the indentation (i.e. the position in the call stack). Further, trace effectively hides the tracing code from the domain logic. Both seems hard to do without a custom implementation.

However, notice that the requirements could also be fulfilled by other means such as:

Extending the language server with a De-Bruijn mode and catching panics
Being able to generate a (Latex) report with typing derivations (note that dealing with panics complicates this)

However, I understand that there are additional requirements not satisfied by the current implementation:

Print additional debug information that would not match this format
Filter debug output
Keep the code easy to understand and hence minimize custom macro magic

Am I missing any additional aspects? How do you envision the debug output to look like? Which kinds of filtering do we need? Which additional output do we need?

BinderDavid commented 4 months ago

I think colored and indented output shouldn't be a problem. I probably just have to replace print_to_string by print_colored. The env_logger also has options for multiline logs: https://docs.rs/env_logger/latest/env_logger/struct.Builder.html#method.format_indent

The main constraint that the current implementation does not satisfy for me is that it tries to separate the tracing from the domain logic. Just tracing at the function boundary does not work: Being able to add to the log from arbitrary positions inside the logic is essential. For example, I want to emit something to the trace whenever a metavar has been solved, and I have to do this from deep within the unify function.

BinderDavid commented 4 months ago

I have removed the [Trace] prefix since you don't like clutter :) It now looks like the previous implementation, except for the colors:

 |- head =>
[[]] |- Stream ▷ Stream
↓Stream ~> Stream
[[],[Stream]] |- Nat => Type
 |- add =>
[[]] |- Nat <= Type
Type =? Type
[[]] |- Nat ▷ Nat
↓Nat ~> Nat
↓Nat ~> Nat
[[y@0.0]] |- Nat ▷ Nat
↓Nat ~> Nat
[[Nat],[Nat]] |- Nat => Type
[[y@1.0],[@0.0]] |- Nat ▷ Nat
↓Nat ~> Nat
[[Nat]] |- Z => y <= Nat
[[y@1.0],[]] |- Nat ▷ Nat
↓Nat ~> Nat
[[y@1.0],[]] |- Nat ▷ Nat
↓Nat ~> Nat
[[Nat],[]] |- y <= Nat
Nat =? Nat
[[Nat]] |- S(x') => S(x'.add(y)) <= Nat
[[Nat],[]] |- Nat <= Type
Type =? Type
[[y@1.0],[]] |- Nat ▷ Nat
↓Nat ~> Nat
[[y@1.0],[x'@0.0]] |- Nat ▷ Nat
↓Nat ~> Nat
[[y@1.0],[x'@0.0]] |- Nat ▷ Nat

So coming back to the discussion: My two central requirements are:

Allow to add to the trace from within arbitrary positions in the code.
Allow to filter out events that are not interesting to me.

Point 1) is essential because I need to trace different things in different parts of a conditional. One example would be to register every time we have mutated some global state, such as solving a metavar. It is also essential because I can add temporary information to the trace during a debug session without having to change the architecture of the code. Point 2) is important because I want to add tracing to other parts of the compiler as well, such as lowering, but I want to have an easy way to filter out events that are not interesting to me.

BinderDavid commented 4 months ago

[ ] Γ |- f x => ?
 |  Γ |- f => A -> B
 |  Γ |- x  => A
 |------------------- ( -> )-Elim
 |  Γ |- f x  => B
[x]

That would be super rad :) But I think we can implement this using a combination of for example askama templates and the trace!() macro. But you cannot implement this in general at the function boundary. As a simple example, take the implementation of CheckInfer for constructors. We can only print a nice typing rule after we have done the lookup of the signature of the constructor in the typing context. So the code could look something like this:

impl CheckInfer for Call {
    /// The *checking* rule for calls is:
    /// ```text
    ///                 ...
    ///           ──────────────────
    ///            P, Γ ⊢ Cσ ⇐ τ
    /// ```
    fn check(&self, prg: &Module, ctx: &mut Ctx, t: Rc<Exp>) -> Result<Self, TypeError> {
       ...
    }
    /// The *inference* rule for calls is:
    /// ```text
    ///                 ...
    ///           ──────────────────
    ///            P, Γ ⊢ Cσ ⇒ ...
    /// ```
    fn infer(&self, prg: &Module, ctx: &mut Ctx) -> Result<Self, TypeError> {
        let Call { span, kind, name, args, .. } = self;

        match kind {
            CallKind::Codefinition | CallKind::Constructor => {
                let Ctor { name, params, typ, .. } = &prg.ctor_or_codef(name, *span)?;

                trace!(ctor_trace_template(name, args, params, typ))   <<------!

                let args_out = check_args(args, prg, name, ctx, params, *span)?;
                let typ_out = typ
                    .subst_under_ctx(vec![params.len()].into(), &vec![args.args.clone()])
                    .to_exp();
                let typ_nf = typ_out.normalize(prg, &mut ctx.env())?;
                Ok(Call {
                    span: *span,
                    kind: *kind,
                    name: name.clone(),
                    args: args_out,
                    inferred_type: Some(typ_nf),
                })
            }

BinderDavid commented 4 months ago

I just sketched what I mean in 9d695f1 If you don't like it at all I can revert the commit :)

The trace looks like this:


[[Nat],[Nat]] |- y <= Nat
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ Var
    [[Nat],[Nat]] ⊢ y : Nat
Nat =? Nat
[[y@1.0],[x'@0.0]] |- Nat ▷ Nat
↓Nat ~> Nat
[[Nat],[Nat]] |- x' <= Nat
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ Var
    [[Nat],[Nat]] ⊢ x' : Nat
Nat =? Nat
[[y@1.0],[x'@0.0]] |- Nat ▷ Nat
↓Nat ~> Nat
Nat =? Nat
[[y@1.0],[x'@0.0]] |- Nat ▷ Nat
↓Nat ~> Nat
Nat =? Nat

Obviously still needs appropriate spacing, etc. But what do you think about the general approach?

timsueberkrueb commented 4 months ago

Sounds good :+1:. If Askama is not sufficient, we could also write a custom macro (or changing the previous macro from an attribute macro to a function-like macro). E.g. we probably can't control the length of the inference line, but that also wouldn't bother me too much. What I originally had in mind was indenting sub-derivations and thereby generating linearized, sort of flag-style nested typing derivations, but that might get unreadable fairly quickly. This is an example for what I originally had in mind:

[ ] Γ |- (id f) x => ?
 | 
 |  [ ] Γ |- (id f) => ?
 |   |  Γ |- id => (A -> B) -> (A -> B)
 |   |  Γ |- f: A -> B
 |   | ------------------------ ( -> )-Elim
 |  [x] Γ |- (id f) => A -> B
 |
 | Γ |- f => A -> B
 | Γ |- x  => A
 |
 |------------------- ( -> )-Elim
[x] Γ |- f x  => B

That would be possible with the previous mechanism by tracking the indentation level in global mutable state, but the readability might degrade quickly for larger programs and it doesn't interact well with other types of log messages.

timsueberkrueb commented 4 months ago

Anyway, all of this goes way beyond to what we had before, so if you fix the colors, that is fine with me.

BinderDavid commented 4 months ago

That would be possible with the previous mechanism by tracking the indentation level in global mutable state, but the readability might degrade quickly for larger programs and it doesn't interact well with other types of log messages.

I think we can implement this in a very light-weight manner on top of the log infrastructure from the Rust ecosystem. What reusing the log infrastructure gives us for free is very expressive facilities for tagging and filtering log messages, as well as a separation of the mechanisms for emitting and collecting them (I.e. log vs env_logger). For example, we can have a global mutable variable (for example of type String, to begin with) which we set whenever we enter a toplevel declaration. Then we prefix every log message with the name of the toplevel declaration. For tracking the structure of expressions we could do something similar if we need it.

BinderDavid commented 4 months ago

@timsueberkrueb I implemented the color printing in a8416ee Better only take a look at this specific commit. The commits inbetween only split the huge typechecking module into individual modules for each syntax node. This makes it possible to filter the log according to the typing rule that you are interested in, because we can filter according to the modules which generate the trace messages.

timsueberkrueb commented 4 months ago

Looking good. The output of the Var rule looks a bit out of place though (because it's the only thing printed as a derivation rule and there's no spacing.

Example:

↓Nat ~> Nat
[[Nat],[Nat]] |- y <= Nat
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ Var
    [[Nat],[Nat]] ⊢ y : Nat
Nat =? Nat

timsueberkrueb commented 4 months ago

So maybe remove Askama again for now and leave that for future work.

BinderDavid commented 4 months ago

So maybe remove Askama again for now and leave that for future work.

I have reverted the commit :)

polarity-lang / polarity

Replace logging infrastructure #233