josefs / Gradualizer

A Gradual type system for Erlang
MIT License
613 stars 35 forks source link

Non-exhaustive matches in case expressions #140

Closed erszcz closed 5 years ago

erszcz commented 5 years ago

I'm preparing a short talk on Gradualizer for our local community. One of the basic arguments for static typing I wanted to show is that case expressions are checked for all alternatives of an ADT. For example:

data Allergen = Eggs
              | Chocolate
              | Pollen
              | Cats

allergenScore :: Allergen -> Int
allergenScore al = case al of
  Eggs         ->   1
  Chocolate    ->  32
  Pollen       ->  64

GHC will warn us that Cats are not taken into account:

$ ghc -Wincomplete-patterns src/Allergies.hs
[1 of 1] Compiling Allergies        ( src/Allergies.hs, src/Allergies.o )

src/Allergies.hs:13:20: warning: [-Wincomplete-patterns]
    Pattern match(es) are non-exhaustive
    In a case alternative: Patterns not matched: Cats
   |
13 | allergenScore al = case al of
   |                    ^^^^^^^^^^...

As will the PureScript compiler (which is somewhat more interesting due to its Erlang backend):

$ purs compile src/Allergies.purs
Compiling Allergies
Error found:
in module Allergies
at src/Allergies.purs line 13, column 20 - line 16, column 20

  A case expression could not be determined to cover all inputs.
  The following additional cases are required to cover all inputs:

    Cats

  Alternatively, add a Partial constraint to the type of the enclosing value.

Gradualizer as of now doesn't seem to provide such a warning for a literal translation of the example:

-type allergen() :: eggs
                  | chocolate
                  | pollen
                  | cats.

-spec allergen_score(allergen()) -> integer().
allergen_score(Al) ->
    case Al of
        eggs         ->  1;
        chocolate    -> 32;
        pollen       -> 64
    end.

Should we expect that it will in the future?

zuiderkwast commented 5 years ago
  1. The use of non-exhaustive cases is considered good style in the Erlang community according to the let it crash principle. Catch-all clauses are considered bad style. Perhaps we can implement it anyway, but I would suggest an option for this, perhaps per function, e.g. -gradualizer([{exhastiveness_checking, [allergen_score/1]}]).

  2. Your example is easy to implement, especially now since type refinement is implemented for simple cases without guards. It would be harder though, or impossible, for a function with complex guards. We would have to describe in which cases you can expect the gradualizer to do exhastiveness checking or perhaps we can issue a warning if exhastiveness checking is enabled for a function where there the gradualizer is unable to determine if the clauses are exhastive.

erszcz commented 5 years ago

Thanks for a prompt response!

The use of non-exhaustive cases is considered good style in the Erlang community according to the let it crash principle.

Indeed, but I'm not thinking about catch-all clauses in this case. For example, let's assume we're extending/refactoring a piece of code and we've just added the cats allergen - I believe it would be a great productivity boon if Gradualizer could tell us about all case expressions across the project matching on allergen, which might need extending with the new alternative - yet let us decide if we do extend them or not.

It would be harder though, or impossible, for a function with complex guards.

Hmmm, so rephrasing your words two kinds of warnings would make the most sense:

  1. Non-exhaustive patterns for type t().
  2. Cannot determine case expression exhaustiveness for t().

That would be awesome!

zuiderkwast commented 5 years ago

Exactly! Two kinds of warnings.

I do understand the value of this. Still, we don't warnings for this where you actually want non-exhaustive clauses, so an option is needed (unless someone can come up with some clever heuristics for this).

OvermindDL1 commented 5 years ago

Perhaps case's should be setup like how polymorphic variants are in OCaml?

For reference, here is how they work in OCaml:

# let h = function
| `Foo -> `Bar
| `Bar -> `Foo    

# val h : [< `Bar | `Foo ] -> [> `Bar | `Foo ] = <fun>

So this function takes a closed set of polymorphic variants (the < means it must be any subset of this set) and returns an open set of polymorphic variants (the > means it currently does return one of the things in this set based on the input, but it could potentially return more in the future).

What this means is that if you match over a closed set then you get proper warnings/errors for non-exhaustive cases, and if you match over an open set then you get a warning/error for not having a fallthrough match _ -> ....

Thus I'd propose that the or | in typespecs defines a closed variant type by default, you get warnings without handling all cases (that could be ignored on a file-by-file or line-by-line basis or part of the (extended) spec) but otherwise will still compile. Such a message should not propagate outside out the type, only within the body, so given something like:

-type allergen() :: eggs
                  | chocolate
                  | pollen
                  | cats.

-spec allergen_score(allergen()) -> integer().
allergen_score(Al) ->
    case Al of
        eggs         ->  1;
        chocolate    -> 32;
        pollen       -> 64
    end.

test() ->
    allergen_score(cats).

Would cause a warning to be printed on the case Al of expression, but would not cause an issue on the allergen_score(cats) expression as it is handled by the type properly. If someone wants to silence the expression then perhaps some type declaration can state that. If someone wants a specifically open type then that could be some other type declaration as well.

But regardless, it should still compile, just noisely and in a way where the warnings can be silenced through other means (extended/external type declarations?) and the warnings could still become hard errors if they fail that regardless.

I really want exhaustiveness checking. :-)

erszcz commented 5 years ago

I've looked for some resources on the topic and found:

I have no idea if these materials will ever be useful - I want to read them, but haven't done so yet. As of now I have even less of an idea if it could be applied to Gradualizer or the required changes would be too significant. Anyway, it can't hurt to leave the links here.

josefs commented 5 years ago

As @zuiderkwast said, we currently have some of the needed machinery in place already, which is used for type refinement. It just needs a willing hand :-)

erlandsona commented 5 years ago

Any updates on this? Seems like sort of a critical thing that this tool would do?

zuiderkwast commented 5 years ago

Yes, #188 is ongoing!

josefs commented 5 years ago

The allergy example from the top comment now produces the following error in Gradualizer (thanks to #188):

Nonexhaustive patterns on line 15 at column 9
Example values which are not covered: [cats]

Closing this ticket.

Zalastax commented 5 years ago

A small thing about formatting. Is it a good idea to use the list syntax for printing the example values? It's not perfectly clear that the example value is cats and not [cats].

josefs commented 5 years ago

@Zalastax, right. The list represent the number of arguments we have when pattern matching. For function clauses we can have several arguments that we match against. But for case expressions there is ever only one value that we match on. I agree that it makes sense to remove the list when we have a case expression. One idea to improve the error message for function clauses is to use ordinary parentheses just like the syntax for defining clauses.