Closed clueless-skywatcher closed 9 months ago
You need to use $ so that the code gen uses the correct reference code
$xxx
And so on.
On Sat, Feb 24, 2024 at 12:05 Epsilonator wrote:
Target Language: Java
ANTLR Version: 4.13.2
IDE: Visual Studio Code (Version 1.86.2)
IDE Extension: ANTLR4 grammar syntax support (Version 2.4.6)
Build System: Gradle
I am following a video to create a parser, and I need to parse strings and capture the parsed string in a variable. The rule for matching strings is as follows:
STRING
: '"'
{ StringBuilder b = new StringBuilder(); }
(c=~('\n' | '\r' | '"') { b.appendCodePoint(c); })*
'"'
{setText(b.toString());}
;
At the equals sign (next to the c, on the line after the StringBuilder initialization), ANTLR complains:
syntax error: '=' came as a complete surprise to me while looking for lexer rule element
while no such issues popped up on the video.
Any help to resolve this issue and pointing out where I am going wrong would be highly appreciated. Is this due to a version mismatch?
@jimidle I fixed that just now and it's still throwing the error
STRING
: '"'
{ StringBuilder b = new StringBuilder(); }
(c=~('\n' | '\r' | '"') { $b.appendCodePoint(c); })*
'"'
{setText($b.toString());}
;
Where do I need to add the $ sign?
Ah, I see what you are doing. Your StringBuilder is out of scope. You need to declare it in the @decls {} section. Ignore the $ comment; that's for labels in your grammar.
@jimidle The problem is not with the StringBuilder here. It's just that ANTLR refuses to acknowledge the "=" sign beside the "c". Declaring the StringBuilder in the decls section still didn't fix the issue.
Here is the full grammar for reference
grammar Lark;
options {
language = Java;
}
@header {
import java.util.*;
}
@decls {
StringBuilder idB = new StringBuilder();
}
prog
: stmt+
;
stmt
: expr ';'
| term ';'
| assign ';'
| functionDef ';'
| functionAnonDef ';'
| functionCall ';'
;
functionCall
: IDENTIFIER '(' actualParams? ')'
;
actualParams
: expr (',' expr)*
;
term
: IDENTIFIER
| '(' expr ')'
| INTEGER
| DECIMAL
| STRING
| CHARACTER
| IDENTIFIER '(' actualParams? ')'
;
assign
: id=IDENTIFIER ':=' expr {
System.out.println($id.text);
System.out.println($expr.text);
}
;
returnStmt: 'return' expr ';';
functionDef
: '<' IDENTIFIER '>' ':=' '(' params? ')' '->' '{'(stmt | returnStmt)*'}'
; // <Func> := (a, b, c) -> {
// DoThings();
// }
functionAnonDef
: '<' IDENTIFIER '>'
;
params
: param (',' param)*
;
param
: IDENTIFIER
;
negate
: '~'* term
;
unary
: ('+' | '-')* negate
;
exponent
: unary ('^' unary)*
;
multiply
: exponent (('*' | '/' | '%') exponent)*
;
add
: multiply (('+' | '-') multiply)*
;
relation
: add (('=' | '!=' | '<' | '<=' | '>=' | '>') add)*
;
expr
: relation (('and' | 'or') relation)*
;
INTEGER: DIGIT+;
DECIMAL: DIGIT+ '.' DIGIT+;
STRING
: '"'
(c=~('\n' | '\r' | '"') { idB.appendCodePoint(c); })*
'"'
{setText(idB.toString());}
;
CHARACTER
: '\'' . '\'' { setText(getText().substring(1, 2)); }
;
fragment LETTER: [a-zA-Z];
fragment DIGIT: [0-9];
// NEWLINE: '\n';
IDENTIFIER: LETTER (LETTER | DIGIT)* ;
WS: [ \t\n\r\f]+ -> channel(HIDDEN);
@jimidle I believe this is a bug, since replacing the RHS of c=~('\n' | '\r' | '"') with any other rule or character still produces the same error, even after removing all the StringBuilder stuff. I will make a workaround in the visitor later on, but any help is still appreciated.
Have you tried adding $c instead of c?
STRING is a lexer symbol. You can't define an attribute (c = ...) in a lexer rule. (See https://github.com/antlr/grammars-v4/pull/3205.)
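To illustrate the point about labels, here is a minimal sketch, assuming ANTLR 4 with the Java target. The grammar name LabelDemo and the label name value are made up for illustration. Element labels are accepted in parser rules (the assign rule in the posted grammar already uses one), while lexer rules have to be written without them.

```antlr
grammar LabelDemo;   // hypothetical grammar name, for illustration only

// Parser rule: element labels such as id= and value= are allowed here,
// and can be read inside actions as $id.text and $value.text.
assign
    : id=IDENTIFIER ':=' value=STRING
      { System.out.println($id.text + " := " + $value.text); }
    ;

// Lexer rules: no labels on elements. Writing c=~["\r\n] here is what
// triggers the "'=' came as a complete surprise" syntax error.
STRING     : '"' ~["\r\n]* '"' ;
IDENTIFIER : [a-zA-Z] [a-zA-Z0-9]* ;
WS         : [ \t\r\n\f]+ -> channel(HIDDEN) ;
```

If the matched characters are needed inside a lexer action, getText() is available there, which is what the workaround further down relies on.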
After tinkering around for a bit I made a workaround like this
STRING
: '"'
~('\n' | '\r')*
'"'
{ setText(getText().substring(1, getText().length() - 1)); }
;
and it is currently working fine for my use case. Posting this here to help others who face a similar issue.
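One caveat on this workaround, as far as I can tell: ~('\n' | '\r')* also matches the double quote itself, and the ANTLR 4 lexer prefers the longest match, so two string literals on the same line could end up inside a single STRING token. Below is a sketch that excludes the quote from the body, as the original rule did. It assumes the Java target and does not handle escape sequences.

```antlr
STRING
    : '"' ~["\r\n]* '"'
      // Strip the surrounding quotes from the token text.
      { setText(getText().substring(1, getText().length() - 1)); }
    ;
```

With this variant, an input such as "a" + "b" should lex as two separate STRING tokens rather than one.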
Because ANTLR supports many target languages, you cannot declare it and initialize it in the same place. Just declare it there and either use @init {} to initialize it or initialize it in code. Look for examples.
Closing this issue since my problem is solved.
Target Language: Java
ANTLR Version: 4.13.2
IDE: Visual Studio Code (Version 1.86.2)
IDE Extension: ANTLR4 grammar syntax support (Version 2.4.6)
Build System: Gradle
I am following a video to create a parser, and I need to parse strings and capture the parsed string in a variable. The rule for matching strings is as follows:
STRING
: '"'
{ StringBuilder b = new StringBuilder(); }
(c=~('\n' | '\r' | '"') { b.appendCodePoint(c); })*
'"'
{setText(b.toString());}
;
At the equals sign (next to the "c" label, on the line after the StringBuilder initialization), ANTLR complains:
syntax error: '=' came as a complete surprise to me while looking for lexer rule element
while no such issues popped up in the video.
Any help to resolve this issue and pointing out where I am going wrong would be highly appreciated. Is this due to a version mismatch, since in the video they were using ANTLR 3.x?