haskell / error-messages

73 stars 18 forks source link

Negative reasoning for type classes to make better errors #44

Open Ericson2314 opened 2 years ago

Ericson2314 commented 2 years ago

Rust has skirted the edge with various aspects of type classes. In particular, per https://github.com/Ixrec/rust-orphan-rules/issues/1 there have been various explorations of "negative reasoning" and revises orphan rules. (Rust doesn't allow orphans, and uses its orphans to ensure there are no overlaps (except for "specialization" but let's ignore that for now).

One big idea here is that while regular contexts depending on "anti-constraints" is pretty bogus (except for perhaps not equal constraints), there is no reason while top level assertions, that are completely erased, cannot be negative. Indeed, orphan and overlap rules can be phrased in terms somewhat like that (if condition....something is not allowed).

Why do I bring this up? I am thinking it may be useful to make errors better without baking in ad-hoc logic into GHC or removing expressive power.

For example, inspired by the recent https://www.thecodedmessage.com/posts/haskell-gripe/, take

{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE OverloadedLists #-}
1 + [1]

this program is deeply unconstrained, and could perhaps be well typed in all sorts of stupid ways. But if we could write a declaration like

assert !(Num a, IsList a)

to assert no type has both number and list literals, we can in fact rule out the program, with a nice error, e.g something like

No type 'a' satisfying 
  1 :: a
  [1] :: a
  (+) :: a -> a -> a
in expression
  1 + [1]
no types are both list-like and number-like

Besides trying to keep brittle ad-hoc logic out of GHC, I think it quite a subjective matter whether one should provide assertions like this---I would not put such assertions in base. Instead, I would advocate they be just in custom preludes, especially for teaching. This would be a small step towards "teaching languages".

Note that since assert !(Num a, IsList a) is erased / has no computation interpretation, there is no risk of programs "relying" on this in any ways. It simply makes more programs ill-typed. Instances are always exported, but assertions like this can safely not be. Teaching prelude modules would of course export them for downstream student use, but any user is free to write their own that just effects the local module and nothing else.

Ericson2314 commented 2 years ago

Perhaps an equivalent way to break down this feature is to say that instances with "type error" in their contexts are allowed to overlap. Then

assert !(Num a, IsList a)

is quite close to

instance TypeError (Text "no type is both a number and a list")
  => (,) (Num a) (IsList a)

(I curry (,) to make it fit a single head, but we could finesse that. Arguably tuple heads are falsely maligned as invalid syntax when they are merely overlapping rules, but that's a separate conversation.)

adamgundry commented 2 years ago

See also #42. The consensus there seems to be that forbidding specific instances in base by defining "negative instances" using TypeError in the context is not the way to go. That said, I think you make a good point:

Besides trying to keep brittle ad-hoc logic out of GHC, I think it quite a subjective matter whether one should provide assertions like this---I would not put such assertions in base. Instead, I would advocate they be just in custom preludes, especially for teaching.

I suppose it would not be that hard to provide a library consisting solely of negative (orphan) instances, so that the user can opt in to them. That avoids the "but what if someone really wants a Num String instance" class of arguments, because they aren't mandatory as part of base. But it does raise questions about how new users discover the existence of this library. (And if there are multiple such libraries, they might both define orphan instances for the same class/type.)

The other interesting idea here is allowing negative assertions that are strictly more powerful than existing instances, e.g. ruling out the combination of Num and IsList being instantiated. I think it would be challenging to extend the constraint solver in general to deal with such instances consistently with its existing behaviour, even if it is assumed there are type errors in the context. But perhaps when GHC is reporting a set of unsolved constraints, it could try to match whichever assertions are in scope to simplify the reported errors. This might allow more general forms of matching on constraints than are currently permitted in instances, for example to tackle ambiguity as well.

I'm imagining something like this:

report (Num a, IsList a) = Text "no type is both a number and a list"

where report is a keyword, the thing on the LHS is some kind of "constraint pattern" (which might involve things like asserting certain variables are ambiguous, and perhaps distinguishing wanted/given constraints?), and the thing on the RHS is an ErrorMessage (perhaps allowing some computation to determine the message). This wouldn't necessarily imply that the constraints on the LHS are always unsolvable; rather it would give a way to report them if they turned out to be unsolved in a particular case. (An open question is how to prioritize such rules when there are multiple matches...)

A real-world example where something like this could be helpful is https://github.com/well-typed/optics/issues/308, where it would be very nice to be able to add domain-specific messages when certain constraints are unsolved due to ambiguity, even though the constraints are not in general impossible.

Ericson2314 commented 2 years ago

Yes great points @adamgundry. I also find it more interesting that your Unsatisfiable improves the operational semantics, whereas with these assertions/reports, I think we don't want any (though I haven't thought about -fdefered-type-errors too much).

Perhaps the bad TypeError is forking into two replacements each of which is a better fit for what it's used for? :)

Ericson2314 commented 2 years ago

Also, I hope our handling of things like Int ~ () can be partially subsumed into such a model assertion framework. It's quite similar, except for the fact that we might not want to go as far as influencing the notion of dead code, e.g. for pattern matching on

data BetterStillBeInhabited where
  MkBetterStillBeInhabited :: (Num a, Show a) => BetterStillBeInhabited
goldfirere commented 2 years ago

This thread feels very speculative to me -- well beyond the goal of this repo, which is about improving GHC's error messages in a way we can action shortly. A general mechanism for how users can control error messages would indeed be great, but may I politely request that this desire not derail the more practical work being debated in this repo?

Ericson2314 commented 2 years ago

@goldfirere I think there is a legitimate difference of opinion:

In terms of cost, this seems not that different from orphan instances with type errors to me, and thus not hard to implement. Conversely, I am still wary of base-only hacks as discussed in #5 and there their, for once we have the features need, it is much easier to add, modify, and generally maintain custom messages in libraries than in GHC.

A concrete example is what if we do add the "strings are not numbers" ad-hoc logic. Simultaneously, there is a bunch of interest in making Text the default, steering users away from String, etc. So we do that, and then someone rightly complains quality of error messages have regressed. Now what? Do add wired-in symbols for Text? Does text become a "boot" package? Do big changes like UTF-8 text harder become harder to backport to older GHCs?

Every coupling between GHCs and library definitions imposed a large burden upon our agility going forward, and I would be loathe to see more of them arise when an alternative solution seems so close within reach

goldfirere commented 2 years ago

I'm not against what you're proposing. I'm just saying that this proposal doesn't, in my opinion, fit with the spirit of this repo. Your approach would need a thorough design, a GHC proposal, and a non-trivial implementation. I would be surprised if all that could be accomplished in less than a year. On the other hand, this repo is about improving the text of error messages that could be implemented in an hour. It is true that work being done here might move us into a local maximum that could reduce our agility. But the amount by which it would reduce our agility would be minimal, and my experience with GHC is that large proposals take a long time, by which we will have confused more people. Instead, I wish to stop confusing people today -- and then hope to have a general approach that would confuse even fewer people tomorrow.

Ericson2314 commented 2 years ago

I started to write up basically what you said in https://github.com/haskell/error-messages/issues/5#issuecomment-1047072060 . If couple any ad-hoc thing with a commitment (ideally HF budget for implementation, if HF were to fund the temporary solution) to circle back and do the more general thing later, that relieves my concerns enough that I'll stop complaining.