antlr / grammars-v4

Grammars written for ANTLR v4; expectation that the grammars are free of actions.
MIT License

tsql: Occasional STRING token mismatch #1092

Open garymazz opened 6 years ago

garymazz commented 6 years ago

For some odd reason, the STRING token occasionally fails to match valid string literals:

Test Values:

N'WWW'
'WWW'
N'W''W''W'
'W''W''W'

Token in TSqlLexer.g4:

STRING: 'N'? '\'' (~'\'' | '\'\'')* '\'';
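As a sanity check on the rule itself, it can be transcribed into a regular expression and run against the four test values. This is only a sketch: the regex is a hand translation of the rule, not something ANTLR generates, and `is_string_token` is a hypothetical helper name. It exercises the pattern in isolation and says nothing about how the rule interacts with the rest of the lexer.

```python
import re

# Hand transcription (an assumption, not ANTLR output) of:
#   STRING: 'N'? '\'' (~'\'' | '\'\'')* '\'';
# [^'] plays the role of ~'\'' and '' is the doubled-quote escape.
STRING = re.compile(r"N?'(?:[^']|'')*'")


def is_string_token(text: str) -> bool:
    """Return True if the whole text is matched by the STRING pattern."""
    return STRING.fullmatch(text) is not None


# The four test values from the report.
test_values = ["N'WWW'", "'WWW'", "N'W''W''W'", "'W''W''W'"]
results = {value: is_string_token(value) for value in test_values}

for value, ok in results.items():
    print(f"{value!r}: {'match' if ok else 'no match'}")
```

If all four values match here, the pattern taken alone accepts them, which would point toward something outside the rule (for example, another lexer rule) as the source of the mismatch.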

garymazz commented 6 years ago

A quick fix

Alter the token in TSqlLexer.g4 to:

STRING: ('N' '\'' (~'\'' | '\'''\'')* '\'') | ('\'' (~'\'' | '\'''\'')* '\'') ;
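On paper, this expanded rule and the original `'N'?` form describe the same set of strings, since an optional prefix is equivalent to an alternation with and without it. A brute-force comparison over short strings illustrates this. The two regexes below are hand transcriptions of the rules (an assumption, not ANTLR output), and the three-character alphabet is an arbitrary choice for the sketch.

```python
import itertools
import re

# Hand transcriptions of the two rule forms (assumptions, not ANTLR output).
SHORT = re.compile(r"N?'(?:[^']|'')*'")                 # 'N'? form
LONG = re.compile(r"N'(?:[^']|'')*'|'(?:[^']|'')*'")    # expanded alternation


def accepted(pattern: re.Pattern, text: str) -> bool:
    """Return True if the pattern matches the entire text."""
    return pattern.fullmatch(text) is not None


# Enumerate every string of length 0..7 over a small alphabet and
# collect the ones where the two forms disagree.
ALPHABET = "N'W"
disagreements = [
    "".join(chars)
    for length in range(0, 8)
    for chars in itertools.product(ALPHABET, repeat=length)
    if accepted(SHORT, "".join(chars)) != accepted(LONG, "".join(chars))
]

print("disagreements:", disagreements)
```

An empty list means the two forms agree on every enumerated string, so a behavioral difference between them in a real lexer would have to come from somewhere other than the language each rule describes.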

KvanTTT commented 6 years ago

Maybe use the shorter form?

STRING: 'N'? '\'' (~'\'' | '\'\'')* '\'';

garymazz commented 6 years ago

The shorter version was not matching with ANTLR 4.7.1.

KvanTTT commented 6 years ago

What tokens did you get for rule STRING: 'N'? '\'' (~'\'' | '\'\'')* '\'' with ANTLR 4.7.1?

garymazz commented 6 years ago

It was weird... I got the tokens, but they were not matching. In one case the text N'text' would not match; in others 'text' would not match. I'm thinking it may be an ANTLR bug(?). I need to get through my first pass of the grammar, then loop back around for some deeper dives.

KvanTTT commented 6 years ago

Maybe there is a token ambiguity.
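To illustrate what a token ambiguity can look like: an ANTLR lexer picks the rule that matches the most input, and when two rules match the same length, the rule defined first wins. The toy lexer below mimics that policy. It is a sketch only: `lex` is a hypothetical helper, and `QUOTED` is an invented competing rule that is not taken from TSqlLexer.g4. It does not claim to reproduce the actual cause of this issue.

```python
import re


def lex(rules, text):
    """Tokenize with longest-match-wins; ties go to the earliest rule."""
    tokens, pos = [], 0
    while pos < len(text):
        best_name, best_len = None, 0
        for name, pattern in rules:
            m = pattern.match(text, pos)
            # Strictly greater: an earlier rule keeps the win on equal length.
            if m and m.end() - pos > best_len:
                best_name, best_len = name, m.end() - pos
        if best_name is None:
            raise ValueError(f"no rule matches at position {pos}")
        tokens.append((best_name, text[pos:pos + best_len]))
        pos += best_len
    return tokens


STRING = ("STRING", re.compile(r"N?'(?:[^']|'')*'"))
# Hypothetical competing rule, invented for this sketch.
QUOTED = ("QUOTED", re.compile(r"'[^']*'"))
WS = ("WS", re.compile(r"\s+"))

source = "N'WWW' 'WWW'"
string_first = lex([STRING, QUOTED, WS], source)
quoted_first = lex([QUOTED, STRING, WS], source)

print(string_first)
print(quoted_first)
```

With `STRING` listed first, both literals come out as `STRING`. With the hypothetical `QUOTED` listed first, `'WWW'` is emitted as `QUOTED` while `N'WWW'` is still `STRING`, so a parser rule expecting `STRING` would accept one literal and reject the other even though the `STRING` pattern itself matches both.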

garymazz commented 6 years ago

Thanks Ivan... I was thinking the same way. I'll look into it in more detail at the end of the week. Right now I have some issues related to context-aware parsing.

On Sun, Apr 1, 2018 at 5:14 PM, Ivan Kochurkin notifications@github.com wrote:

Maybe there is a tokens ambiguity.
