ocaml / merlin

Context sensitive completion for OCaml in Vim and Emacs
https://ocaml.github.io/merlin/
MIT License
1.59k stars 232 forks source link

Wildcard constructor patterns display an incorrect type on hover #1792

Open Innf107 opened 4 months ago

Innf107 commented 4 months ago

Given a data constructor with multiple parameters and a constructor pattern with a wildcard like this, hovering over the _ in VSCode shows type string even though it should be int * string.

type t = A of int * string
let f (A _) = 5

Arguably, int * string isn't quite correct either since this is a data constructor with two arguments, not with a single tuple argument and e.g. A x wouldn't compile, but string is definitely wrong.

In my experience, this comes up pretty often when trying to figure out the type of a constructor's arguments when pattern matching on it. The alternative is to guess the number of arguments the constructor has and write out A (_, _) manually, but this is a little annoying if it has more than 3 arguments.

voodoos commented 4 months ago

Thanks for your report @Innf107 Looks like there is an associativity issue in Merlin: adding parenthesis type t = A of (int * string) fixes the issue.

Reproduction:

  $ cat >test.ml <<'EOF'
  > type t = A of (int * string)
  > let f (A _) = 5
  > EOF

  $ $MERLIN single type-enclosing -position 2:9 \
  > -filename test.ml <test.ml | jq '.value[0].type'
  "int * string"

  $ cat >test.ml <<'EOF'
  > type t = A of int * string
  > let f (A _) = 5
  > EOF

FIXME: wrong associativity, result should be int * string
  $ $MERLIN single type-enclosing -position 2:9 \
  > -filename test.ml <test.ml | jq '.value[0].type'
  "string"
Innf107 commented 4 months ago

It's not quite the same type with parentheses though. In that case it is a single-argument constructor that takes a tuple, so e.g.

type t1 = A of int * string
type t2 = B of (int * string)

let f (A x) = x (* fails because A takes two arguments but is only given one *)
let g (B x) = x (* compiles *)
voodoos commented 4 months ago

Oh you're right. Also it's not an associativity issue else the wrong result would be int, not string.

voodoos commented 4 months ago

@xvw

Looks like this is de-sugared by the parser into: A _ -> A (_, _) with both wildcards sharing the location of the source one. Merlin only considers the last one which has type string. It might be tricky to solve properly.

Typedtree:

    structure_item (test.ml[2,27+0]..test.ml[2,27+15])
      Tstr_value Nonrec
      [
        <def>
          pattern (test.ml[2,27+4]..test.ml[2,27+5])
            Tpat_var \"f/278\"
          expression (test.ml[2,27+6]..test.ml[2,27+15]) ghost
            Texp_function
            [
              Nolabel
              Param_pat
                pattern (test.ml[2,27+6]..test.ml[2,27+11])
                  Tpat_construct \"A\"
                  [
                    pattern (test.ml[2,27+9]..test.ml[2,27+10])
                      Tpat_any
                    pattern (test.ml[2,27+9]..test.ml[2,27+10])
                      Tpat_any
                  ]
                  None
            ]
            Tfunction_body
              expression (test.ml[2,27+14]..test.ml[2,27+15])
                Texp_constant Const_int 5
      ]
xvw commented 4 months ago

Yes, I'll take some time for destruct on Thursday and I'll try to find a solution