dhall-lang / dhall-haskell

Maintainable configuration files
https://dhall-lang.org/
BSD 3-Clause "New" or "Revised" License
917 stars 213 forks source link

Provide non-truncated version in normal error messages #532

Open joneshf opened 6 years ago

joneshf commented 6 years ago

While working on defining Kan extensions and whatnot, some of the error messages are pretty hard to decipher unless the --explain flag is used. Compare

Without `--explain` ```Dhall $ dhall <<< ./Coyoneda/lower Use "dhall --explain" for detailed errors ↳ ./Coyoneda/lower Error: Expression doesn't match annotation ∀(… : ∀(… : …) → …) → ∀(… : - Type + { … : … } ) → ∀(… : - ∀(… : …) → … + Type ) → - … … + ∀(… : …) → … let lower : ∀(f : Type → Type) → ∀(a : Type) → Coyoneda f a → f a = λ(f : Type → Type) → λ(functor : Functor f) → λ(a : Type) → λ(coyoneda : Coyoneda f a) → coyoneda (f a) (λ(b : Type) → λ(k : b → a) → λ(x : f b) → functor.map b a k x) in lower ```
With `--explain` ```Dhall $ dhall --explain <<< ./Coyoneda/lower ↳ ./Coyoneda/lower Error: Expression doesn't match annotation ∀(… : ∀(… : …) → …) → ∀(… : - Type + { … : … } ) → ∀(… : - ∀(… : …) → … + Type ) → - … … + ∀(… : …) → … Explanation: You can annotate an expression with its type or kind using the ❰:❱ symbol, like this: ┌───────┐ │ x : t │ ❰x❱ is an expression and ❰t❱ is the annotated type or kind of ❰x❱ └───────┘ The type checker verifies that the expression's type or kind matches the provided annotation For example, all of the following are valid annotations that the type checker accepts: ┌─────────────┐ │ 1 : Natural │ ❰1❱ is an expression that has type ❰Natural❱, so the type └─────────────┘ checker accepts the annotation ┌───────────────────────┐ │ Natural/even 2 : Bool │ ❰Natural/even 2❱ has type ❰Bool❱, so the type └───────────────────────┘ checker accepts the annotation ┌────────────────────┐ │ List : Type → Type │ ❰List❱ is an expression that has kind ❰Type → Type❱, └────────────────────┘ so the type checker accepts the annotation ┌──────────────────┐ │ List Text : Type │ ❰List Text❱ is an expression that has kind ❰Type❱, so └──────────────────┘ the type checker accepts the annotation However, the following annotations are not valid and the type checker will reject them: ┌──────────┐ │ 1 : Text │ The type checker rejects this because ❰1❱ does not have type └──────────┘ ❰Text❱ ┌─────────────┐ │ List : Type │ ❰List❱ does not have kind ❰Type❱ └─────────────┘ Some common reasons why you might get this error: ● The Haskell Dhall interpreter implicitly inserts a top-level annotation matching the expected type For example, if you run the following Haskell code: ┌───────────────────────────────┐ │ >>> input auto "1" :: IO Text │ └───────────────────────────────┘ ... then the interpreter will actually type check the following annotated expression: ┌──────────┐ │ 1 : Text │ └──────────┘ ... and then type-checking will fail ──────────────────────────────────────────────────────────────────────────────── You or the interpreter annotated this expression: ↳ λ(f : Type → Type) → λ ( functor : ( λ(f : Type → Type) → { map : ∀(a : Type) → ∀(b : Type) → ∀(g : a → b) → ∀(fa : f a) → f b } ) f ) → λ(a : Type) → λ ( coyoneda : ( λ(g : Type → Type) → λ(a : Type) → ∀(r : Type) → (∀(b : Type) → (b → a) → g b → r) → r ) f a ) → coyoneda (f a) (λ(b : Type) → λ(k : b → a) → λ(x : f b) → functor.map b a k x) ... with this type or kind: ↳ ∀(f : Type → Type) → ∀(a : Type) → (∀(r : Type) → (∀(b : Type) → (b → a) → f b → r) → r) → f a ... but the inferred type or kind of the expression is actually: ↳ ∀(f : Type → Type) → ∀ ( functor : { map : ∀(a : Type) → ∀(b : Type) → ∀(g : a → b) → ∀(fa : f a) → f b } ) → ∀(a : Type) → ∀(coyoneda : ∀(r : Type) → (∀(b : Type) → (b → a) → f b → r) → r) → f a ──────────────────────────────────────────────────────────────────────────────── let lower : ∀(f : Type → Type) → ∀(a : Type) → Coyoneda f a → f a = λ(f : Type → Type) → λ(functor : Functor f) → λ(a : Type) → λ(coyoneda : Coyoneda f a) → coyoneda (f a) (λ(b : Type) → λ(k : b → a) → λ(x : f b) → functor.map b a k x) in lower ```

The problem is that the type given in the let binding is missing the Functor f, but it's really hard to discern that from the diff format used for mismatches. in particular, this bit really helped to understand the problem:

Expanded names ```Dhall You or the interpreter annotated this expression: ↳ λ(f : Type → Type) → λ ( functor : ( λ(f : Type → Type) → { map : ∀(a : Type) → ∀(b : Type) → ∀(g : a → b) → ∀(fa : f a) → f b } ) f ) → λ(a : Type) → λ ( coyoneda : ( λ(g : Type → Type) → λ(a : Type) → ∀(r : Type) → (∀(b : Type) → (b → a) → g b → r) → r ) f a ) → coyoneda (f a) (λ(b : Type) → λ(k : b → a) → λ(x : f b) → functor.map b a k x) ... with this type or kind: ↳ ∀(f : Type → Type) → ∀(a : Type) → (∀(r : Type) → (∀(b : Type) → (b → a) → f b → r) → r) → f a ... but the inferred type or kind of the expression is actually: ↳ ∀(f : Type → Type) → ∀ ( functor : { map : ∀(a : Type) → ∀(b : Type) → ∀(g : a → b) → ∀(fa : f a) → f b } ) → ∀(a : Type) → ∀(coyoneda : ∀(r : Type) → (∀(b : Type) → (b → a) → f b → r) → r) → f a ```

Seeing the full names allowed my brain to spot the difference that was harder to see with the truncated version.

Can we add this bit to the normal error messages? The diff is nice for "trivial" errors like Integer vs. Natural, or one level of record name mismatches. For anything "non-trivial," it's pretty hard to figure out the problem.

If not, is there another approach to having more understandable error messages?

Gabriella439 commented 6 years ago

I think the solution I would prefer is to make the diff more intelligent rather than include the original types. The reason for only using a diff in the first place is that the original types can get extremely large for some configuration file formats (like in dhall-to-cabal).

For example, the diff could conceivably look like this:

    ∀(f : …)
+ → ∀(functor : …)
  → ∀(a: …)
  → ∀(coyoneda: …)
  → …
joneshf commented 6 years ago

I'm down for it!

Gabriella439 commented 6 years ago

@joneshf: So the bad news is that we might not be able to easily preserve the variable names since the expressions are α-normalized before diffing. However, I still think we can more intelligently compute insertions

Also, could you give me a self-contained expression for your code that I can use for testing the displayed diff?

joneshf commented 6 years ago

Hmm, that's unfortunate. is it feasible to alpha-normalize while retaining the original name as metadata?

SCE The `Functor` is from https://github.com/FormationAI/dhall-bhat/blob/485c8f3e4d5397a36a194788f1507f8f0634ce16/Functor/Type ```Dhall let Functor = λ(f : Type → Type) → { map : ∀(a : Type) → ∀(b : Type) → ∀(g : a → b) → ∀(fa : f a) → f b } in let Coyoneda = λ(f : Type → Type) → λ(a : Type) → ∀(r : Type) → (∀(b : Type) → (b → a) → f b → r) → r in let lower : ∀(f : Type → Type) → ∀(a : Type) → Coyoneda f a → f a = λ(f : Type → Type) → λ(functor : Functor f) → λ(a : Type) → λ(coyoneda : Coyoneda f a) → coyoneda (f a) (λ(b : Type) → λ(k : b → a) → λ(x : f b) → functor.map b a k x) in lower ```
Gabriella439 commented 6 years ago

The main technical limitation is how to present the information to the user if the two types being compared use different variable names. For example:

let id : ∀(a : Type) → a → a → a = λ(b : Type) → λ(x : b) → x in id

The two types being compared would be:

∀(a : Type) → ∀(_ : a) → ∀(_ : a) → a

∀(b : Type) → ∀(x : b) → b

... and the only actual difference in the types is that the first type has one extra ∀(_ : a), so you could display:

    ∀(…)
  → ∀(…)
+ → ∀(…)
  → …

but if you want to also display variable names in the diff then you have to pick which type to take the variable names from. Do we show this:

    ∀(a : Type)
  → ∀(_ : a)
+ → ∀(_ : a)
  → a

... or this:

    ∀(b : Type)
  → ∀(x : b)
+ → ∀(_ : a)
  → b

You might think that we should pick the former diff since the latter diff is weird because the a comes out of nowhere. Fixing the latter diff to replace a with b so that it still makes sense is difficult to get correct.

However, if you choose to pick the former diff then you can just switch the type error in the opposite direction and get the exact same problem but now in reverse. So no matter which direction you prefer you end up with the same problem.

Profpatsch commented 6 years ago

How about only displaying the original name when both names match?

Gabriella439 commented 6 years ago

@Profpatsch: The problem with that is that you would also need to fix all references to that variable to match. For example, suppose that you compare these two types:

∀(a : Type) → ∀(b : Text) → a

∀(a : Type) → ∀(c : Bool) → a

... which respectively α-normalize to:

∀(_ : Type) → ∀(_ : Text) → _@1

∀(_ : Type) → ∀(_ : Bool) → _@1

The diff for what would be something like:

  ∀(_ : Type)
→ ∀(_ : - Text
        + Bool
  )
→ _

... but if you change the first _ to a, you would also need to fix the last variable, too:

  ∀(a : Type)
→ ∀(_ : - Text
        + Bool
  )
→ a
Profpatsch commented 6 years ago

That looks quite sensible to me though? If you mean that in the last example there’s too much noise in the diff, wasn’t that exactly what was originally requested by OP?

Gabriella439 commented 6 years ago

@Profpatsch: What I meant is that doing it in the sensible way is difficult to get correct

Gabriella439 commented 6 years ago

I just ran into another issue when attempting to implement my suggested fix. Consider a comparison of the following types:

  ∀(x : { foo : Text })
→ Natural
  ∀(x : { foo : Natural })
→ Natural

If you compare them the way I suggested, you would get:

- ∀(_ : { foo : Text })
+ ∀(_ : { foo : Natural })
→ Natural

... instead of the narrower diff of:

  ∀(_ : { foo : - Text
                + Natural
   )
→ Natural

I'll try to see if there's another approach to simplifying the diff

Gabriella439 commented 6 years ago

So I can simplify it a little bit by condensing the skeleton for function types:

  …
→ ∀(… : - Type
        + { … : … }
  )
→ ∀(… : - … → …
        + Type
  )
→ - … …
  + … → …

... and I'll put up a pull request for that

joneshf commented 6 years ago

Going back to the original problem, could there be another level of reporting?

Right now, there's:

  1. Normal - Shows the error, type diff, and original expression
  2. Explain - Shows the error, type diff, explanation, prose diff, and original expression

An alternate way to report could be:

  1. Minimal - Shows the error, type diff, and original expression
  2. Normal - Shows the error, type diff, prose diff, and original expression
  3. Explain - Shows the error, type diff, explanation, prose diff, and original expression

In the normal case–without any flags–you see the prose diff by default. Based on informal surveying, the prose diff is easier to understand; so, it seems like a good default. If you've a huge expression, you can tone it down. If you need more help, you can ask for the explanation.

I'm not committed to the names, nor suggesting flags. But, it feels like the two levels of reporting we have now aren't matching the uses in practice.

Could something like that work?

Gabriella439 commented 6 years ago

@joneshf: I'm not sure if the prose diff should be the default, though, because I believe we should optimize for configuration files with complex schemas where the concise diff is the clearer of the two error messages. For example, in dhall-to-cabal the prose explanation does not help because the user cannot easily scroll past the large types to read the text of the error message. The user also cannot visually compare the types to see what went wrong because the difference is a needle in a haystack.

We want to focus on the problem of managing large and complex configuration schemas because Dhall introduces added complexity and most people are only willing to tolerate that added complexity if they have an otherwise unmanageable configuration file. In other words, I propose that Dhall's tagline should be "When JSON or YAML is not an option".

If we optimize for simple configuration files then Dhall won't gain traction because there the marginal benefit of using Dhall over JSON doesn't justify the switching costs and complexity.

joneshf commented 6 years ago

I see. I think I'm coming at this from the perspective of using Dhall as a programming language (rather than a configuration language). In that setting, the type diff isn't quite as helpful. It's interesting (but not unexpected) that making things better for configuration and making things better for programming are at odds with each other.

What about the other part of that proposal: a third level of reporting?

Gabriella439 commented 6 years ago

@joneshf: Yeah, I understand that some people use Dhall as a "JSON for code" so that code can be transmitted and stored as a first-class value and I still try to support that use case

For the third level of reporting, the main issue is that the default level of reporting (i.e. Normal) still shouldn't show the prose diff by default because it would be too big for large configuration types.

joneshf commented 6 years ago

For the third level of reporting, the main issue is that the default level of reporting (i.e. Normal) still shouldn't show the prose diff by default because it would be too big for large configuration types.

Right, that's fine. I'm wondering if we can get a third level that isn't the default.

Gabriella439 commented 6 years ago

@joneshf: My reasoning was that if Minimal were the default then nobody would want to use Normal and would always jump straight to Explain

jneira commented 6 years ago

And what about include -error-* flags to enable/disable each of possible elements of the output: type diff(?), explanation, prose diff, and original expression (see #573)? It adds complexity to the command line interface, but it seems the possible outputs in a error are somewhat fixed. Or at least show the error at the beginning...

Gabriella439 commented 5 years ago

Revisiting this as I go through old issues:

@jneira: I'd prefer to not add detailed flags for customizing errors, for two main reasons: