Open Dokotela opened 2 years ago
Hi,
It seems that you are expecting antlr4 to produce different tokens depending on the context. This is not supported by antlr4: tokens are produced before they are submitted to the parser rules. I strongly suggest that you split your grammar into lexer and parser rules, so that you can control the precedence of token production. And since 'div' is both a token and a valid identifier, you probably need a grammar rule as follows:
identifier: IDENTIFIER | DIV;
Sent from my iPhone
On 22 Sept. 2022 at 05:00, Grey Faulkenberry, MD MPH @.***> wrote:
I have an issue where I seem to get the incorrect tokens for one of my inputs when I'm using antlr4 in Dart.
To start, I have installed ANTLR Parser Generator version 4.11.1. I'm using the following grammar: http://hl7.org/fhirpath/N1/fhirpath.g4
I generate the files using the following command:
java org.antlr.v4.Tool -Dlanguage=Dart -no-listener -visitor fhirpath.g4
All of the files seem to generate correctly, and in general I thought they were working well. But when I use the input string "Patient.text.div" I get an error. To demonstrate this, generate the files as I did above and then run this program:
import 'package:antlr4/antlr4.dart';
import 'fhirpathLexer.dart';
import 'fhirpathParser.dart';

void main() {
  final input = InputStream.fromString("Patient.text.div");
  final lexer = fhirpathLexer(input);
  // Uncomment to dump the raw token list produced by the lexer.
  // print(lexer.allTokens);
  final tokens = CommonTokenStream(lexer);
  final parser = fhirpathParser(tokens);
  parser.buildParseTree = true;
  final tree = parser.expression();
}
You will see that it gives the following error message:
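line 1:13 mismatched input 'div' expecting {'is', 'as', 'in', 'contains', '$this', '$index', '$total', IDENTIFIER, DELIMITEDIDENTIFIER}
line 1:16 mismatched input '<EOF>' expecting {'+', '-', 'is', 'as', 'in', 'contains', '(', '{', 'true', 'false', '%', '$this', '$index', '$total', DATE, DATETIME, TIME, IDENTIFIER, DELIMITEDIDENTIFIER, STRING, NUMBER}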
If I uncomment the lexer.allTokens line in the program, the output shows where the problem is:
[[@-1,0:6='Patient',<58>,1:0], [@-1,7:7='.',<1>,1:7], [@-1,8:11='text',<58>,1:8], [@-1,12:12='.',<1>,1:12], [@-1,13:15='div',<8>,1:13]]
So instead of interpreting 'div' as an IDENTIFIER <58>, as it does with 'Patient' and 'text', the lexer emits the 'div' operator token <8> that is used in #multiplicativeExpression. 'div' can be that operator, but not in this case. I can understand how it would make this error; the first part of the grammar is:
expression
        : term                                                      #termExpression
        | expression '.' invocation                                 #invocationExpression
        | expression '[' expression ']'                             #indexerExpression
        | ('+' | '-') expression                                    #polarityExpression
        | expression ('*' | '/' | 'div' | 'mod') expression         #multiplicativeExpression
I have simplified the above grammar as much as I can while still reproducing the bug. Here it is:
grammar fhirpath;

expression
        : term                                                      #termExpression
        | expression '.' invocation                                 #invocationExpression
        | expression ('*' | '/' | 'div' | 'mod') expression         #multiplicativeExpression
        ;

term
        : invocation                                                #invocationTerm
        | literal                                                   #literalTerm
        ;

literal
        : STRING                                                    #stringLiteral
        ;

invocation                          // Terms that can be used after the function/member invocation '.'
        : identifier                                                #memberInvocation
        ;

identifier
        : IDENTIFIER
        ;

IDENTIFIER
        : ([A-Za-z] | '_')([A-Za-z0-9] | '_')*            // Added _ to support CQL (FHIR could constrain it out)
        ;

STRING
        : '\'' (ESC | .)*? '\''
        ;

fragment ESC
        : '\\' ([`'\\/fnrt] | UNICODE)    // allow \`, \', \\, \/, \f, etc. and \uXXX
        ;

fragment UNICODE
        : 'u' HEX HEX HEX HEX
        ;

fragment HEX
        : [0-9a-fA-F]
        ;
Thank you for any help or suggestions.
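For anyone landing here later, the suggestion in the reply could be applied to the simplified grammar roughly as sketched below. This is only a sketch and has not been run against the full FHIRPath grammar. The reply only mentions DIV; the MOD token and its rule name are assumptions added here on the same reasoning, since 'mod' is also a keyword that could appear as a member name. Only the changed rules are shown; the rest of the simplified grammar is assumed to stay as it is.

```antlr
// Parser rules: refer to the keyword tokens by name.
expression
        : term                                              #termExpression
        | expression '.' invocation                         #invocationExpression
        | expression ('*' | '/' | DIV | MOD) expression     #multiplicativeExpression
        ;

// Accept the keyword tokens wherever an identifier is expected,
// so that 'div' after a '.' can still be parsed as a member name.
identifier
        : IDENTIFIER
        | DIV
        | MOD       // assumption: not part of the reply, added by analogy with DIV
        ;

// Lexer rules: keywords are declared before IDENTIFIER.
// When two lexer rules match the same longest text, the rule
// declared first wins, so 'div' is always lexed as DIV.
DIV : 'div' ;
MOD : 'mod' ;

IDENTIFIER
        : ([A-Za-z] | '_')([A-Za-z0-9] | '_')*
        ;
```

The lexer still produces the same token for 'div' in every position; what changes is that the parser now accepts that token inside identifier. The error message in the issue lists 'is', 'as', 'in' and 'contains' among the expected tokens at the identifier position, which suggests the full grammar already handles those keywords in a similar way.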