uwol / proleap-vb6-parser

ProLeap ANTLR4-based parser for Visual Basic 6.0
MIT License

Line numbers not supported #7

Closed by zbagz 7 years ago

zbagz commented 7 years ago
1    MsgBox("Line 1")
2    MsgBox("Line 2")
3    MsgBox("Line 3")

line 116:0 extraneous input '1' expecting {END_SUB, NEWLINE}

Not sure how complex this is to implement honestly. I can probably get around it by stripping out line numbers before sending the files to the parser, but maybe somebody else needs it, who knows.

Thanks.

uwol commented 7 years ago

Line numbers, just like precompilation directives, require a preprocessor. I had to develop one for cobol85parser as well.

From a VB6 grammar or syntax perspective, lines and token positions are transparent. The lexer collects all tokens from the VB6 source code, and the parser then evaluates the token stream for syntax without regard to character positions. Thus the grammar cannot identify a line number as the first integer literal in a line.

However, the line numbers could be stripped in a preprocessing step, with something like this:

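import java.util.Scanner;

Scanner scanner = new Scanner(...);
StringBuilder sb = new StringBuilder();

while (scanner.hasNextLine()) {
  String line = scanner.nextLine();
  // Assumed format: optional indent, a run of digits, then whitespace.
  String strippedLine = line.replaceFirst("^\\s*\\d+\\s+", "");
  // nextLine() drops the line terminator, so add it back.
  sb.append(strippedLine).append('\n');
}

scanner.close();
return sb.toString();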
zbagz commented 7 years ago

Got it, I'm stripping them out with regex now. Thanks for the superb explanation as usual Ulrich.
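
The regex zbagz ended up with is not shown in the thread. As a rough sketch of what a regex-only strip might look like, the class below removes leading line numbers from a whole source string in one pass. The class name `LineNumberStripper`, the method `strip`, and the pattern itself are illustrative assumptions, not part of proleap-vb6-parser. The pattern assumes a line number is a run of digits at the start of a line, optionally indented and followed by at least one space or tab.

```java
import java.util.regex.Pattern;

public class LineNumberStripper {
    // Assumption: a line number is a run of digits at the start of a line,
    // optionally indented, followed by at least one space or tab.
    private static final Pattern LINE_NUMBER =
            Pattern.compile("^[ \\t]*\\d+[ \\t]+", Pattern.MULTILINE);

    /** Removes leading line numbers; line breaks are left untouched. */
    public static String strip(String source) {
        return LINE_NUMBER.matcher(source).replaceAll("");
    }

    public static void main(String[] args) {
        // The example from the issue, reduced to two lines.
        String source = "1    MsgBox(\"Line 1\")\n2    MsgBox(\"Line 2\")\n";
        System.out.print(strip(source));
    }
}
```

One known limitation of this sketch: a continued statement (a line ending in ` _`) whose next line happens to start with a number would lose that number, so sources that use line continuations would probably need a stricter check.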