unisonweb / unison

A friendly programming language from the future
https://unison-lang.org

Change handling of “blank” identifiers #5282

Status: Closed (closed by sellout 1 month ago)

sellout commented 1 month ago

Overview

The proximate bug was that the lexer for Blank would “steal” the first segment of an identifier. E.g., _a.blah would be lexed as [Blank "a", WordyId ".blah"], rather than [WordyId "_a.blah"]. However, fixing that directly was fragile and still left other cases unhandled, like the one in #4681.

Also, most uses of Blank exactly mirrored WordyId, often rebuilding the original segment.

This PR removes the Blank token, instead treating it as any other WordyId. The parser now checks identifiers for “blankness” as needed.
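To illustrate the approach, here is a minimal sketch of lexing every identifier as a WordyId and deciding “blankness” later in the parser. It is not the code from this PR: the Token and Ident types, isBlank, and classify are hypothetical and heavily simplified, and the rule that a blank is a single segment starting with an underscore is an assumption made for the example.

```haskell
import Data.List (isPrefixOf)

-- Hypothetical, simplified token type. The real lexer has many more
-- constructors; the point is only that there is no separate Blank token.
data Token
  = WordyId String   -- any wordy identifier, including "_" and "_a.blah"
  | Reserved String  -- keywords and the like
  deriving (Eq, Show)

-- Assumption for this sketch: an identifier is "blank" when it is a single
-- segment (no '.') that starts with an underscore, e.g. "_" or "_suffix".
isBlank :: String -> Bool
isBlank name = "_" `isPrefixOf` name && '.' `notElem` name

-- Hypothetical parser-side view of an identifier.
data Ident
  = BlankIdent String  -- the suffix after the underscore (may be empty)
  | NamedIdent String  -- an ordinary, possibly multi-segment, name
  deriving (Eq, Show)

-- The parser inspects the identifier only where blankness matters.
classify :: Token -> Maybe Ident
classify (WordyId name)
  | isBlank name = Just (BlankIdent (drop 1 name))
  | otherwise    = Just (NamedIdent name)
classify _ = Nothing
```

With this shape, _a.blah stays a single WordyId and classifies as an ordinary name, so there is no point at which the lexer can split off its first segment.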

Fixes #2822.

Implementation notes

There were two places that treated Blank differently than WordyId, and those are preserved. There were also two places where a “true” Blank (_) was treated differently than a suffixedBlank (_withSomeSuffix), and those have been eliminated.

I considered storing “blankness” in an extra field of WordyId, but that suffers from boolean blindness and also means that it’s possible to create inconsistent terms (e.g., WordyId IsBlank "definitely.not.blank").
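The trade-off can be sketched as follows. The types here (Blankness, FlaggedId, PlainId) are hypothetical stand-ins rather than the lexer's actual definitions, and the blankness rule (single segment, leading underscore) is an assumption for the example.

```haskell
import Data.List (isPrefixOf)

data Blankness = IsBlank | NotBlank
  deriving (Eq, Show)

-- Rejected design (sketch): blankness stored next to the name.
data FlaggedId = FlaggedId Blankness String
  deriving (Eq, Show)

-- Nothing in the type stops the flag from contradicting the name.
inconsistent :: FlaggedId
inconsistent = FlaggedId IsBlank "definitely.not.blank"

-- Alternative (sketch): store only the name and derive blankness on demand,
-- so the two can never disagree.
newtype PlainId = PlainId String
  deriving (Eq, Show)

blankness :: PlainId -> Blankness
blankness (PlainId name)
  | "_" `isPrefixOf` name && '.' `notElem` name = IsBlank
  | otherwise                                   = NotBlank

-- With the stored flag, consistency has to be checked at runtime.
isConsistent :: FlaggedId -> Bool
isConsistent (FlaggedId flag name) = flag == blankness (PlainId name)
```

Deriving the property from the name removes the need for a check like isConsistent, at the cost of recomputing blankness wherever the parser asks for it.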

Test coverage

There is a new transcript with examples pulled from both #2822 and #4681, as well as other coverage I thought was useful.

ChrisPenner commented 1 month ago

Ooh, does this now allow _ as a type-hole? If so I'm stoked on it 🙌🏼

sellout commented 1 month ago

> Ooh, does this now allow _ as a type-hole? If so I'm stoked on it 🙌🏼

If you mean

  I couldn't figure out what _ refers to here:

      4 | > x _ +3

  I think its type should be:

      Int

instead of

  I got confused here:

      4 | > x _ +3

  I was surprised to find a _ here.
  I was expecting one of these instead:

  * and
  * bang
  * …

then yes!