Closed: Lurgrid closed this issue 1 month ago
You can use `lrlex` as a binary to test the lex file alone. In this case, line 3 column 8 is because (AFAICS) you don't have a lexing rule that matches `toto`.
I don't understand why the rule `<SEP>[^\r\n]+ "STRING"` isn't supposed to match the string `toto`, because it will have read `SEP` and it's a non-empty string of characters that isn't a carriage return or end of line.
I don't know what `<SEP>` is supposed to be. It may be lex syntax we don't support. I think lrlex will treat it as a literal `<SEP>`, but I might be wrong.
As far as I know, to use a state in Lex, you have to add `%x STATE`, where `STATE` is your state, at the beginning of the file. Then use `BEGIN(STATE)` in one of the rules. Here's an example:
%x SEP
%%
: { BEGIN(SEP); return SEP; }
PRODID { return PRODID; }
<SEP>[^\r\n]+ { BEGIN(INITIAL); return STRING; }
%%
So, given that your book says “Lex uses a special action expression BEGIN(state) to switch to the named state. grmtools lex files use a token name prefix.”, here's how I would have rewritten the previous Lex code:
%x SEP
%%
: "SEP"
PRODID "PRODID"
<SEP>[^\r\n]+ "STRING"
In my code, `SEP` just stands for “:”. I think I must have misunderstood something.
Ah, I'm not familiar with the state stuff. @ratmice might understand that part better than I do.
Blurry eyed and half asleep, but looking at it I don't see where you are entering the `SEP` state. The rule `\: "SEP"` doesn't do so; it returns a "SEP" token.
Because lrlex doesn't run code actions, it uses a notion of a state operator to begin and end states.
To actually enter the `SEP` state you'll want a rule like `: <+SEP>;`. There is an example in `lrpar/examples/start_states`.
Edit: I really need to work on adding a section to the book for this.
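To make the difference concrete, here is a minimal Python sketch of a lexer with an exclusive start state. It is not lrlex's implementation: the names `lex`, `RULES` and `RULES_NO_ENTRY` are hypothetical, and the matching policy (longest match, earliest rule wins ties) is an assumption based on conventional Lex behaviour. Each rule carries the state it is active in, a pattern, an optional token name, and an optional state to switch to, mirroring `<SEP>`, `<+SEP>` and `<-SEP>`.

```python
import re

# (active state, pattern, token name or None, state to switch to or None)
RULES = [
    ("INITIAL", r":", "SEP", "SEP"),             # like  : <+SEP>"SEP"
    ("INITIAL", r"PRODID", "PRODID", None),      # like  PRODID "PRODID"
    ("SEP", r"[^\r\n]+", "STRING", "INITIAL"),   # like  <SEP>[^\r\n]+ <-SEP>"STRING"
    ("INITIAL", r"[\r\n]+", None, None),         # like  [\r\n]+ ;
]

# Same rules, except ":" only returns a token and never enters SEP.
RULES_NO_ENTRY = [("INITIAL", r":", "SEP", None)] + RULES[1:]


def lex(text, rules):
    """Tokenise `text`, only trying rules that belong to the current state."""
    state, pos, tokens = "INITIAL", 0, []
    while pos < len(text):
        best = None
        for rule_state, pattern, token, next_state in rules:
            if rule_state != state:
                continue  # exclusive states: other states' rules are inactive
            m = re.compile(pattern).match(text, pos)
            # Strictly longer matches replace the current best,
            # so the earliest rule wins a tie.
            if m and m.end() > pos and (best is None or m.end() > best[0]):
                best = (m.end(), token, next_state)
        if best is None:
            raise ValueError(f"no rule matches at offset {pos} in state {state}")
        end, token, next_state = best
        if token is not None:
            tokens.append((token, text[pos:end]))
        if next_state is not None:
            state = next_state
        pos = end
    return tokens


print(lex("PRODID:toto\n", RULES))
```

With `RULES`, the input `PRODID:toto` lexes as `PRODID`, `SEP`, `STRING`. With `RULES_NO_ENTRY`, the sketch raises an error at zero-based offset 7, which is column 8 and corresponds to the position reported for `toto` in the original test file: the lexer is still in the initial state, so the `<SEP>` rule is never tried.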
Ohh okay, thanks a lot! So my Lex file should look like this:
%x SEP
%%
: <+SEP>"SEP"
BEGIN "BEGIN"
END "END"
<SEP>VCALENDAR <-SEP>"VCALENDAR"
METHOD "METHOD"
PRODID "PRODID"
VERSION "VERSION"
CALSCALE "CALSCALE"
<SEP>VEVENT <-SEP>"VEVENT"
DTSTAMP "DTSTAMP"
DTSTART "DTSTART"
DTEND "DTEND"
SUMMARY "SUMMARY"
LOCATION "LOCATION"
DESCRIPTION "DESCRIPTION"
UID "UID"
CREATED "CREATED"
LAST-MODIFIED "LAST-MODIFIED"
SEQUENCE "SEQUENCE"
<SEP>REQUEST <-SEP>"REQUEST"
<SEP>GREGORIAN <-SEP>"GREGORIAN"
<SEP>[0-9]{8}T[0-9]{6}Z <-SEP>"DATE"
<SEP>[1-9][0-9]* <-SEP>"NUM"
<SEP>[1-9][0-9]*\.[0-9]* <-SEP>"FLOAT"
<SEP>[^\r\n]+ <-SEP>"STRING"
[\r\n]+ ;
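The file above relies on rule order inside the `SEP` state: the catch-all `[^\r\n]+` rule comes last. The following Python sketch illustrates why that ordering matters, assuming the conventional Lex policy of longest match with ties going to the earliest rule (assumed here for lrlex as well). The names `SEP_RULES` and `pick` are hypothetical, and only a subset of the `<SEP>` rules is reproduced.

```python
import re

# A subset of the <SEP> rules from the file above, in the same order.
SEP_RULES = [
    (r"GREGORIAN", "GREGORIAN"),
    (r"[0-9]{8}T[0-9]{6}Z", "DATE"),
    (r"[1-9][0-9]*", "NUM"),
    (r"[1-9][0-9]*\.[0-9]*", "FLOAT"),
    (r"[^\r\n]+", "STRING"),
]


def pick(text):
    """Return (token, lexeme) for the rule that would win at the start of `text`."""
    best_len, best_tok = 0, None
    for pattern, token in SEP_RULES:
        m = re.match(pattern, text)
        # Only a strictly longer match replaces the current best,
        # so an earlier rule wins when two rules match the same length.
        if m and m.end() > best_len:
            best_len, best_tok = m.end(), token
    return best_tok, text[:best_len]


for value in ["GREGORIAN", "2.0", "1", "toto", "2.0 beta"]:
    print(value, "->", pick(value))
```

Under these assumptions `2.0` is lexed as `FLOAT` because `FLOAT` and `STRING` match the same length and `FLOAT` is listed first, whereas `2.0 beta` becomes a `STRING` because the catch-all match is longer.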
Indeed, that looks more like what I would expect. I've used the `... <-SEP>"DATE"` syntax before, so I would expect `<+SEP>"SEP"` to work. Looking through the parsers I've written using start states, though, I don't see any examples where I've entered a state and returned a token, so definitely let us know if you run into anything unexpected.
`<+SEP>"SEP"` works very well, but it's true that I don't need to send it to my parser.
Cool, well, sounds to me like this is fixed. Feel free to reopen if I'm mistaken, or open another issue if you run into anything else.
Hello,
I'm making a parser for a simplified version of the ICAL file format, and I need to use prefixes to detect strings in ICAL files. It doesn't seem to work in my case, and I don't know whether I've misunderstood how to use them.
Here's my Lex file
```
%x SEP
%%
\: "SEP"
BEGIN "BEGIN"
END "END"
VCALENDAR "VCALENDAR"
METHOD "METHOD"
REQUEST "REQUEST"
PRODID "PRODID"
VERSION "VERSION"
CALSCALE "CALSCALE"
GREGORIAN "GREGORIAN"
VEVENT "VEVENT"
DTSTAMP "DTSTAMP"
DTSTART "DTSTART"
DTEND "DTEND"
SUMMARY "SUMMARY"
LOCATION "LOCATION"
DESCRIPTION "DESCRIPTION"
UID "UID"
CREATED "CREATED"
LAST-MODIFIED "LAST-MODIFIED"
SEQUENCE "SEQUENCE"
[0-9]{8}T[0-9]{6}Z "DATE"
[1-9][0-9]* "NUM"
[1-9][0-9]*\.[0-9]* "FLOAT"
```
Here's my Yacc
```
%start Cal
%avoid_insert "BEGIN"
%avoid_insert "END"
%avoid_insert "VCALENDAR"
%avoid_insert "METHOD"
%avoid_insert "REQUEST"
%avoid_insert "PRODID"
%avoid_insert "VERSION"
%avoid_insert "CALSCALE"
%avoid_insert "GREGORIAN"
%avoid_insert "VEVENT"
%avoid_insert "DTSTAMP"
%avoid_insert "DTSTART"
%avoid_insert "DTEND"
%avoid_insert "SUMMARY"
%avoid_insert "LOCATION"
%avoid_insert "DESCRIPTION"
%avoid_insert "UID"
%avoid_insert "CREATED"
%avoid_insert "LAST-MODIFIED"
%avoid_insert "SEQUENCE"
%avoid_insert "NUM"
%avoid_insert "FLOAT"
%avoid_insert "DATE"
%avoid_insert "STRING"
%avoid_insert "SEP"
%%
Cal -> ():
    'BEGIN' 'SEP' 'VCALENDAR'
    'METHOD' 'SEP' 'REQUEST'
    'PRODID' 'SEP' 'STRING'
    'VERSION' 'SEP' 'FLOAT'
    'CALSCALE' 'SEP' 'GREGORIAN'
    LEvent
    'END' 'SEP' 'VCALENDAR' {}
    ;

LEvent -> ():
    %empty {}
    | Event LEvent {}
    ;

Event -> ():
    'BEGIN' 'SEP' 'VEVENT'
    'DTSTAMP' 'SEP' 'DATE'
    'DTSTART' 'SEP' 'DATE'
    'DTEND' 'SEP' 'DATE'
    'SUMMARY' 'SEP' 'STRING'
    'LOCATION' 'SEP' 'STRING'
    'DESCRIPTION' 'SEP' 'STRING'
    'UID' 'SEP' 'STRING'
    'CREATED' 'SEP' 'DATE'
    'LAST-MODIFIED' 'SEP' 'DATE'
    'SEQUENCE' 'SEP' 'NUM'
    'END' 'SEP' 'VEVENT' {}
    ;
%%
```
Here's my test file
```
BEGIN:VCALENDAR
METHOD:REQUEST
PRODID:toto
VERSION:2.0
CALSCALE:GREGORIAN
END:VCALENDAR
```
And here's the error I get when I test