Open joneshf opened 6 years ago
I think the solution I would prefer is to make the diff more intelligent rather than include the original types. The reason for only using a diff in the first place is that the original types can get extremely large for some configuration file formats (like in dhall-to-cabal
).
For example, the diff could conceivably look like this:
∀(f : …)
+ → ∀(functor : …)
→ ∀(a: …)
→ ∀(coyoneda: …)
→ …
I'm down for it!
@joneshf: So the bad news is that we might not be able to easily preserve the variable names since the expressions are α-normalized before diffing. However, I still think we can more intelligently compute insertions
Also, could you give me a self-contained expression for your code that I can use for testing the displayed diff?
Hmm, that's unfortunate. is it feasible to alpha-normalize while retaining the original name as metadata?
The main technical limitation is how to present the information to the user if the two types being compared use different variable names. For example:
let id : ∀(a : Type) → a → a → a = λ(b : Type) → λ(x : b) → x in id
The two types being compared would be:
∀(a : Type) → ∀(_ : a) → ∀(_ : a) → a
∀(b : Type) → ∀(x : b) → b
... and the only actual difference in the types is that the first type has one extra ∀(_ : a)
, so you could display:
∀(…)
→ ∀(…)
+ → ∀(…)
→ …
but if you want to also display variable names in the diff then you have to pick which type to take the variable names from. Do we show this:
∀(a : Type)
→ ∀(_ : a)
+ → ∀(_ : a)
→ a
... or this:
∀(b : Type)
→ ∀(x : b)
+ → ∀(_ : a)
→ b
You might think that we should pick the former diff since the latter diff is weird because the a
comes out of nowhere. Fixing the latter diff to replace a
with b
so that it still makes sense is difficult to get correct.
However, if you choose to pick the former diff then you can just switch the type error in the opposite direction and get the exact same problem but now in reverse. So no matter which direction you prefer you end up with the same problem.
How about only displaying the original name when both names match?
@Profpatsch: The problem with that is that you would also need to fix all references to that variable to match. For example, suppose that you compare these two types:
∀(a : Type) → ∀(b : Text) → a
∀(a : Type) → ∀(c : Bool) → a
... which respectively α-normalize to:
∀(_ : Type) → ∀(_ : Text) → _@1
∀(_ : Type) → ∀(_ : Bool) → _@1
The diff for what would be something like:
∀(_ : Type)
→ ∀(_ : - Text
+ Bool
)
→ _
... but if you change the first _
to a
, you would also need to fix the last variable, too:
∀(a : Type)
→ ∀(_ : - Text
+ Bool
)
→ a
That looks quite sensible to me though? If you mean that in the last example there’s too much noise in the diff, wasn’t that exactly what was originally requested by OP?
@Profpatsch: What I meant is that doing it in the sensible way is difficult to get correct
I just ran into another issue when attempting to implement my suggested fix. Consider a comparison of the following types:
∀(x : { foo : Text })
→ Natural
∀(x : { foo : Natural })
→ Natural
If you compare them the way I suggested, you would get:
- ∀(_ : { foo : Text })
+ ∀(_ : { foo : Natural })
→ Natural
... instead of the narrower diff of:
∀(_ : { foo : - Text
+ Natural
)
→ Natural
I'll try to see if there's another approach to simplifying the diff
So I can simplify it a little bit by condensing the skeleton for function types:
…
→ ∀(… : - Type
+ { … : … }
)
→ ∀(… : - … → …
+ Type
)
→ - … …
+ … → …
... and I'll put up a pull request for that
Going back to the original problem, could there be another level of reporting?
Right now, there's:
An alternate way to report could be:
In the normal case–without any flags–you see the prose diff by default. Based on informal surveying, the prose diff is easier to understand; so, it seems like a good default. If you've a huge expression, you can tone it down. If you need more help, you can ask for the explanation.
I'm not committed to the names, nor suggesting flags. But, it feels like the two levels of reporting we have now aren't matching the uses in practice.
Could something like that work?
@joneshf: I'm not sure if the prose diff should be the default, though, because I believe we should optimize for configuration files with complex schemas where the concise diff is the clearer of the two error messages. For example, in dhall-to-cabal
the prose explanation does not help because the user cannot easily scroll past the large types to read the text of the error message. The user also cannot visually compare the types to see what went wrong because the difference is a needle in a haystack.
We want to focus on the problem of managing large and complex configuration schemas because Dhall introduces added complexity and most people are only willing to tolerate that added complexity if they have an otherwise unmanageable configuration file. In other words, I propose that Dhall's tagline should be "When JSON or YAML is not an option".
If we optimize for simple configuration files then Dhall won't gain traction because there the marginal benefit of using Dhall over JSON doesn't justify the switching costs and complexity.
I see. I think I'm coming at this from the perspective of using Dhall as a programming language (rather than a configuration language). In that setting, the type diff isn't quite as helpful. It's interesting (but not unexpected) that making things better for configuration and making things better for programming are at odds with each other.
What about the other part of that proposal: a third level of reporting?
@joneshf: Yeah, I understand that some people use Dhall as a "JSON for code" so that code can be transmitted and stored as a first-class value and I still try to support that use case
For the third level of reporting, the main issue is that the default level of reporting (i.e. Normal) still shouldn't show the prose diff by default because it would be too big for large configuration types.
For the third level of reporting, the main issue is that the default level of reporting (i.e. Normal) still shouldn't show the prose diff by default because it would be too big for large configuration types.
Right, that's fine. I'm wondering if we can get a third level that isn't the default.
@joneshf: My reasoning was that if Minimal were the default then nobody would want to use Normal and would always jump straight to Explain
And what about include -error-*
flags to enable/disable each of possible elements of the output: type diff(?), explanation, prose diff, and original expression (see #573)?
It adds complexity to the command line interface, but it seems the possible outputs in a error are somewhat fixed.
Or at least show the error at the beginning...
Revisiting this as I go through old issues:
@jneira: I'd prefer to not add detailed flags for customizing errors, for two main reasons:
Very few people will use them (or even know they exist) and they will increase our maintenance burden to support them
They don't accurately preserve user intent, so we can't intelligently modify their behavior because we don't know what problem the user wanted to solve the first place. For example, the intent behind --explain
is "show me as much information as possible". I would prefer flags that capture more high-level user intentions like --pretty
or --compact
rather than --error-type-diff --error-no-context
While working on defining Kan extensions and whatnot, some of the error messages are pretty hard to decipher unless the
--explain
flag is used. CompareWithout `--explain`
```Dhall $ dhall <<< ./Coyoneda/lower Use "dhall --explain" for detailed errors ↳ ./Coyoneda/lower Error: Expression doesn't match annotation ∀(… : ∀(… : …) → …) → ∀(… : - Type + { … : … } ) → ∀(… : - ∀(… : …) → … + Type ) → - … … + ∀(… : …) → … let lower : ∀(f : Type → Type) → ∀(a : Type) → Coyoneda f a → f a = λ(f : Type → Type) → λ(functor : Functor f) → λ(a : Type) → λ(coyoneda : Coyoneda f a) → coyoneda (f a) (λ(b : Type) → λ(k : b → a) → λ(x : f b) → functor.map b a k x) in lower ```With `--explain`
```Dhall $ dhall --explain <<< ./Coyoneda/lower ↳ ./Coyoneda/lower Error: Expression doesn't match annotation ∀(… : ∀(… : …) → …) → ∀(… : - Type + { … : … } ) → ∀(… : - ∀(… : …) → … + Type ) → - … … + ∀(… : …) → … Explanation: You can annotate an expression with its type or kind using the ❰:❱ symbol, like this: ┌───────┐ │ x : t │ ❰x❱ is an expression and ❰t❱ is the annotated type or kind of ❰x❱ └───────┘ The type checker verifies that the expression's type or kind matches the provided annotation For example, all of the following are valid annotations that the type checker accepts: ┌─────────────┐ │ 1 : Natural │ ❰1❱ is an expression that has type ❰Natural❱, so the type └─────────────┘ checker accepts the annotation ┌───────────────────────┐ │ Natural/even 2 : Bool │ ❰Natural/even 2❱ has type ❰Bool❱, so the type └───────────────────────┘ checker accepts the annotation ┌────────────────────┐ │ List : Type → Type │ ❰List❱ is an expression that has kind ❰Type → Type❱, └────────────────────┘ so the type checker accepts the annotation ┌──────────────────┐ │ List Text : Type │ ❰List Text❱ is an expression that has kind ❰Type❱, so └──────────────────┘ the type checker accepts the annotation However, the following annotations are not valid and the type checker will reject them: ┌──────────┐ │ 1 : Text │ The type checker rejects this because ❰1❱ does not have type └──────────┘ ❰Text❱ ┌─────────────┐ │ List : Type │ ❰List❱ does not have kind ❰Type❱ └─────────────┘ Some common reasons why you might get this error: ● The Haskell Dhall interpreter implicitly inserts a top-level annotation matching the expected type For example, if you run the following Haskell code: ┌───────────────────────────────┐ │ >>> input auto "1" :: IO Text │ └───────────────────────────────┘ ... then the interpreter will actually type check the following annotated expression: ┌──────────┐ │ 1 : Text │ └──────────┘ ... and then type-checking will fail ──────────────────────────────────────────────────────────────────────────────── You or the interpreter annotated this expression: ↳ λ(f : Type → Type) → λ ( functor : ( λ(f : Type → Type) → { map : ∀(a : Type) → ∀(b : Type) → ∀(g : a → b) → ∀(fa : f a) → f b } ) f ) → λ(a : Type) → λ ( coyoneda : ( λ(g : Type → Type) → λ(a : Type) → ∀(r : Type) → (∀(b : Type) → (b → a) → g b → r) → r ) f a ) → coyoneda (f a) (λ(b : Type) → λ(k : b → a) → λ(x : f b) → functor.map b a k x) ... with this type or kind: ↳ ∀(f : Type → Type) → ∀(a : Type) → (∀(r : Type) → (∀(b : Type) → (b → a) → f b → r) → r) → f a ... but the inferred type or kind of the expression is actually: ↳ ∀(f : Type → Type) → ∀ ( functor : { map : ∀(a : Type) → ∀(b : Type) → ∀(g : a → b) → ∀(fa : f a) → f b } ) → ∀(a : Type) → ∀(coyoneda : ∀(r : Type) → (∀(b : Type) → (b → a) → f b → r) → r) → f a ──────────────────────────────────────────────────────────────────────────────── let lower : ∀(f : Type → Type) → ∀(a : Type) → Coyoneda f a → f a = λ(f : Type → Type) → λ(functor : Functor f) → λ(a : Type) → λ(coyoneda : Coyoneda f a) → coyoneda (f a) (λ(b : Type) → λ(k : b → a) → λ(x : f b) → functor.map b a k x) in lower ```The problem is that the type given in the let binding is missing the
Functor f
, but it's really hard to discern that from the diff format used for mismatches. in particular, this bit really helped to understand the problem:Expanded names
```Dhall You or the interpreter annotated this expression: ↳ λ(f : Type → Type) → λ ( functor : ( λ(f : Type → Type) → { map : ∀(a : Type) → ∀(b : Type) → ∀(g : a → b) → ∀(fa : f a) → f b } ) f ) → λ(a : Type) → λ ( coyoneda : ( λ(g : Type → Type) → λ(a : Type) → ∀(r : Type) → (∀(b : Type) → (b → a) → g b → r) → r ) f a ) → coyoneda (f a) (λ(b : Type) → λ(k : b → a) → λ(x : f b) → functor.map b a k x) ... with this type or kind: ↳ ∀(f : Type → Type) → ∀(a : Type) → (∀(r : Type) → (∀(b : Type) → (b → a) → f b → r) → r) → f a ... but the inferred type or kind of the expression is actually: ↳ ∀(f : Type → Type) → ∀ ( functor : { map : ∀(a : Type) → ∀(b : Type) → ∀(g : a → b) → ∀(fa : f a) → f b } ) → ∀(a : Type) → ∀(coyoneda : ∀(r : Type) → (∀(b : Type) → (b → a) → f b → r) → r) → f a ```Seeing the full names allowed my brain to spot the difference that was harder to see with the truncated version.
Can we add this bit to the normal error messages? The diff is nice for "trivial" errors like
Integer
vs.Natural
, or one level of record name mismatches. For anything "non-trivial," it's pretty hard to figure out the problem.If not, is there another approach to having more understandable error messages?