rikvdkleij / intellij-haskell

IntelliJ plugin for Haskell
https://rikvdkleij.github.io/intellij-haskell/
Apache License 2.0

UTF letters are not accepted in symbol names #679

Open majkrzak opened 2 years ago

majkrzak commented 2 years ago

When defining the following type:

data Letter = A|B|C|D|E|F|G|H|I|J|K|L|M|N|O|P|Q|R|S|T|U|V|W|X|Y|Z|Ä|Ö

the code analyzer reports the following error:

<constr>, <q name>, <top declaration>, HaskellTokenType.AT, HaskellTokenType.BACKQUOTE, HaskellTokenType.BACKSLASH, HaskellTokenType.CASE, HaskellTokenType.CHARACTER_LITERAL, HaskellTokenType.CLASS, HaskellTokenType.COLON_COLON, HaskellTokenType.COMMA, HaskellTokenType.DATA, HaskellTokenType.DECIMAL, HaskellTokenType.DEFAULT, HaskellTokenType.DERIVING, HaskellTokenType.DIRECTIVE, HaskellTokenType.DO, HaskellTokenType.DOT, HaskellTokenType.DOUBLE_QUOTES, HaskellTokenType.DOUBLE_RIGHT_ARROW, HaskellTokenType.ELSE, HaskellTokenType.EQUAL, HaskellTokenType.FLOAT, HaskellTokenType.HEXADECIMAL, HaskellTokenType.IF, HaskellTokenType.IMPORT, HaskellTokenType.IN, HaskellTokenType.INCLUDE_DIRECTIVE, HaskellTokenType.INFIX, HaskellTokenType.INFIXL, HaskellTokenType.INFIXR, HaskellTokenType.INSTANCE, HaskellTokenType.LEFT_ARROW, HaskellTokenType.LEFT_BRACE, HaskellTokenType.LEFT_BRACKET, HaskellTokenType.LEFT_PAREN, HaskellTokenType.LET, HaskellTokenType.LIST_COMPREHENSION, HaskellTokenType.MODULE, HaskellTokenType.NEWLINE, HaskellTokenType.NEWTYPE, HaskellTokenType.OCTAL, HaskellTokenType.OF, HaskellTokenType.PRAGMA_START, HaskellTokenType.QUASIQUOTE, HaskellTokenType.QUOTE, HaskellTokenType.RIGHT_ARROW, HaskellTokenType.RIGHT_BRACE, HaskellTokenType.RIGHT_BRACKET, HaskellTokenType.RIGHT_PAREN and ... expected, got 'Ä'
majkrzak commented 2 years ago

As far as I understand, this is because the lexer is limited to Latin and Greek letters: https://github.com/rikvdkleij/intellij-haskell/blob/6c1dc71595d16274732e0c2ee4e489255fe1ba16/src/main/scala/intellij/haskell/_HaskellLexer.flex#L45-L46

According to the Haskell 2010 Report §2.2, the lexer should accept "any Unicode lowercase letter" and "any uppercase or titlecase Unicode letter", respectively.
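
To illustrate the two character classes quoted from the report, here is a minimal sketch using the `Data.Char` predicates from `base`. The helper names `isConstructorStart` and `isVariableStart` are hypothetical and exist only for this example; they are not part of the plugin, and the sketch only approximates the report's classes rather than reproducing any real lexer.

```haskell
import Data.Char (isLower, isUpper)

-- Hypothetical helper: approximates the report's class of uppercase or
-- titlecase Unicode letters, which may start a constructor name.
-- Data.Char.isUpper selects both uppercase and titlecase letters.
isConstructorStart :: Char -> Bool
isConstructorStart = isUpper

-- Hypothetical helper: approximates the report's class of lowercase
-- Unicode letters (plus underscore), which may start a variable name.
isVariableStart :: Char -> Bool
isVariableStart c = isLower c || c == '_'

main :: IO ()
main = do
  -- 'Ä' and 'Ö' are classified the same way as 'A' and 'Z'.
  print (map isConstructorStart "AZÄÖ")
  -- Their lowercase forms are not valid constructor starts.
  print (map isConstructorStart "äö")
  -- Lowercase non-ASCII letters are valid variable starts.
  print (map isVariableStart "azäö_")
```

Under these assumptions, `Ä` and `Ö` fall into the same class as `A` through `Z`, so the `Letter` declaration from the report above should be lexed as a list of ordinary constructors.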