erg-lang / erg

A statically typed language compatible with Python
http://erg-lang.org
Apache License 2.0

Fixed an issue where multiple chunks can be declared on a single line #291

Closed GreasySlug closed 1 year ago

GreasySlug commented 1 year ago

Fixes #281

It is currently possible to declare multiple chunks on a single line, separated only by whitespace.

This is invalid syntax and should be detected as an error in the parser.

GreasySlug commented 1 year ago

Spaces are basically removed when characters are converted to tokens. This change requires deep knowledge of the parser and of Erg's syntax, because the sequence of tokens changes the meaning. I do not have that knowledge, so any further changes would be difficult for me to handle.
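To illustrate why a missing separator is hard to see once lexing is done, here is a minimal sketch in Python. It is not Erg's actual lexer: `TOKEN_RE` and `tokenize` are hypothetical names, and the token set is reduced to what the examples in this thread need. It only assumes, as described above, that spaces are dropped while newlines and `;` survive as tokens.

```python
import re

# Hypothetical, heavily reduced token pattern (assumption for illustration).
TOKEN_RE = re.compile(r"""
    (?P<newline>\n)
  | (?P<space>[ \t]+)
  | (?P<number>\d+)
  | (?P<ident>[A-Za-z_]\w*)
  | (?P<symbol>[=;,{}\[\]])
""", re.VERBOSE)


def tokenize(src):
    """Return (kind, text) pairs; spaces and tabs are discarded."""
    tokens = []
    pos = 0
    while pos < len(src):
        m = TOKEN_RE.match(src, pos)
        if m is None:
            raise ValueError(f"unexpected character {src[pos]!r} at {pos}")
        if m.lastgroup != "space":
            # Only non-space tokens reach the parser.
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens
```

With this sketch, `a = 1 b = 1` yields six tokens with nothing between `1` and `b`, whereas `a = 1; b = 1` and the two-line form each keep an explicit separator token. The parser therefore has to notice that a chunk ended and another token follows without a separator.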

The issue I'm concerned about is the difficulty of processing EOF. This has not yet been addressed for blocks, etc. A newline basically solves the problem, but an EOF directly after a token is usually a source of errors.

I think it would be possible to add Dedent and EOF processing everywhere, but that would not be a very good implementation.

GreasySlug commented 1 year ago

Previously, if there was no separator, parsing continued and simply skipped the error point. This produced multiple follow-on syntax errors, which were then reported as errors in modules, etc. Therefore, analysis now stops when an error occurs.

I'm still in the process of adding comments. The EOF issue is not taken into account.

# declaration
1 | a = 1 b = 1
  : -----
  :     `- should add `;` or newline

# set
1 | {1 2 3}
  :  -
  :  `- should add `,` or newline

# array
1 | [1 2 3]
  :  -
  :  `- should add `,` or newline

# record
Error[#0161]: File <stdin>, line 1, 

1 | a = {name = "john" age = 21}
  :      -------------
  :                  `- should add `;` or newline

# class
Error[#0161]: File <stdin>, line 1, 

1 | C = Class {name = Str age = Nat }
  :            --------------
  :                         `- should add `;` or newline

GreasySlug commented 1 year ago

An error is now reported if there is no delimiter or line break after a chunk and another token follows. After detection, the tokens following the error point are often more than a single expr, so they are ignored until the next TC::Separator.
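The detect-then-skip behaviour described above can be sketched as follows. This is a toy model in Python, not the parser code in this PR: `parse_chunks`, `skip_to_separator` and `SEPARATORS` are hypothetical names, a chunk is simplified to exactly `name = value`, and the assumption that TC::Separator covers `;` and newline is mine.

```python
# Stand-in for the TC::Separator token category (assumed: `;` and newline).
SEPARATORS = {";", "\n"}


def skip_to_separator(tokens, i):
    """Advance i to the next separator token, or to the end of input."""
    while i < len(tokens) and tokens[i] not in SEPARATORS:
        i += 1
    return i


def parse_chunks(tokens):
    """Parse toy chunks of the form `name = value`.

    Returns (chunks, errors), where each error is (token_index, message).
    """
    chunks, errors = [], []
    i, n = 0, len(tokens)
    while i < n:
        if tokens[i] in SEPARATORS:
            i += 1
            continue
        # A toy chunk is exactly three tokens: name, "=", value.
        if i + 2 >= n or tokens[i + 1] != "=" or tokens[i + 2] in SEPARATORS:
            errors.append((i, "invalid syntax"))
            i = skip_to_separator(tokens, i)
            continue
        chunks.append(tuple(tokens[i:i + 3]))
        i += 3
        # After a chunk, only a separator or the end of input is allowed.
        if i < n and tokens[i] not in SEPARATORS:
            errors.append((i, "should add `;` or newline"))
            # Ignore everything up to the next separator so that one
            # mistake does not cascade into further errors.
            i = skip_to_separator(tokens, i)
    return chunks, errors
```

For the tokens of `a = 1 b = 1` followed by a newline and `c = 2`, this sketch reports a single error at `b`, drops `b = 1`, and resumes with `c = 2`. End of input directly after a chunk is accepted here, which sidesteps rather than solves the EOF concern raised earlier in the thread.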

GreasySlug commented 1 year ago

Tuple has a different error message, and Class has a different error location:

1 | 1 2 3
  : -
  : `- should add `;` or newline

SyntaxError: invalid syntax

7 | Class {name = Str age = Nat}
  :        --------------
  :                     `- should add `;` or newline

I think it would be better to handle these issues in a separate PR rather than in this one.

GreasySlug commented 1 year ago

`i = 2 j = 1` and `{ "a": 1 "b": 1 }` are different kinds of problems in Erg parsing.

The comma error in collections is a separate issue from this PR, so it has been removed from it. I have also squashed the commits.