Closed krishna116 closed 2 years ago
If you need the whole matched text, then you don't need tags: the match begins at the start position of YYCURSOR and ends at the final position of YYCURSOR. If you want to extract a submatch in the middle of the input, then you need tags. See the two examples below, with and without tags (in your example they are not necessary).
There is no automatic yytext because re2c does not on its own allocate memory and create copies of the input text (this would be too expensive, as the user often doesn't need the copy). If you need a copy, you can easily create one as std::string s(x, y), where x and y are the pointers in the input text (see below).
Example without tags:
#include <stdio.h>
#include <string>

int lex(const char *str) {
    const char *YYCURSOR = str;
    /*!re2c
        re2c:define:YYCTYPE = char;
        re2c:yyfill:enable = 0;

        number = [0-9]+;

        number {
            // just print: the match spans [str, YYCURSOR)
            printf("number: %.*s\n", (int)(YYCURSOR - str), str);
            // save into an std::string
            std::string s(str, YYCURSOR);
            return 1;
        }
        * { return 0; }
    */
}
With tags:
#include <stdio.h>
#include <string>

int lex(const char *YYCURSOR) {
    const char *x, *y;
    /*!stags:re2c format = 'const char *@@;\n'; */
    /*!re2c
        re2c:define:YYCTYPE = char;
        re2c:yyfill:enable = 0;
        re2c:flags:tags = 1;

        number = [0-9]+;

        @x number @y {
            // just print: the submatch spans [x, y)
            printf("number: %.*s\n", (int)(y - x), x);
            // save into an std::string
            std::string s(x, y);
            return 1;
        }
        * { return 0; }
    */
}
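Both examples use a pair of pointers into the input in the same two ways: printing without a copy via "%.*s", and copying via the std::string range constructor. As a rough sketch in plain C++ (no re2c involved), this can be tried in isolation; token_text and print_token are hypothetical helper names introduced only for illustration, not part of re2c.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Copy the half-open range [begin, end) of the input into an owning string.
// token_text is a hypothetical helper name, not something re2c provides.
std::string token_text(const char *begin, const char *end) {
    assert(begin <= end);
    return std::string(begin, end);
}

// Print the same range without copying it.
// "%.*s" expects the length as an int, hence the cast.
// print_token is likewise a hypothetical helper name.
void print_token(const char *begin, const char *end) {
    assert(begin <= end);
    std::printf("token: %.*s\n", static_cast<int>(end - begin), begin);
}
```

In a semantic action, the two arguments would be either the saved start pointer and YYCURSOR (whole match) or a pair of tag variables such as x and y (submatch).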
Also, what document are you referring to? I don't think re2c docs mention yytext.
Please ignore the question, I misread your initial comment as "the document does provide".
skvadrik, thank you very much, but I am still not quite clear about YYCURSOR. For example:
#include<iostream>

int lex(const char *str) {
    const char *YYCURSOR = str;
    const char *begin = nullptr;
    for(;;)
    {
        begin = YYCURSOR;
        /*!re2c
            re2c:define:YYCTYPE = char;
            re2c:yyfill:enable = 0;

            number = [0-9]+;
            spaces = [ \t]+;

            number { std::string s(begin, YYCURSOR); printf("token = [%s], size = %d\n",s.data(),s.size());}
            spaces { std::string s(begin, YYCURSOR); printf("token = [%s], size = %d\n",s.data(),s.size());}
            * { return -1; }
        */
    }
    return 0;
}

int main()
{
    std::string str{ "1234 456" };
    lex(str.data());
    return 0;
}
The output is not fully correct: the third token is wrong.
You have to add continue; at the end of each semantic action (after printf). Otherwise the lexer just falls through into the next state, whatever it might be. Also, s.c_str() is the more conventional way to get the C string from an std::string.
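To illustrate the loop structure this advice leads to, here is a minimal sketch in plain C++ in which hand-written character checks stand in for the re2c-generated matcher. The function name tokenize and the variable cursor (playing the role of YYCURSOR) are assumptions made for this illustration; the generated code looks different. In this sketch, dropping the first continue lets control reach the next check with a stale begin, which is loosely analogous to the fall-through described above.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hand-written stand-in for the lexer loop; not re2c output.
std::vector<std::string> tokenize(const char *str) {
    assert(str != nullptr);
    std::vector<std::string> tokens;
    const char *cursor = str;                  // plays the role of YYCURSOR
    for (;;) {
        const char *begin = cursor;            // token start, reset on every iteration
        if (*cursor >= '0' && *cursor <= '9') {            // number = [0-9]+
            while (*cursor >= '0' && *cursor <= '9') ++cursor;
            tokens.emplace_back(begin, cursor);            // copy [begin, cursor)
            continue;                          // go back to the top so begin is reset
        }
        if (*cursor == ' ' || *cursor == '\t') {           // spaces = [ \t]+
            while (*cursor == ' ' || *cursor == '\t') ++cursor;
            tokens.emplace_back(begin, cursor);
            continue;
        }
        return tokens;                         // default rule: stop, including at the NUL
    }
}
```

Calling tokenize("1234 456") yields three tokens: the number "1234", a single space, and the number "456".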
I struggled with these problems for half a day, and with your help they are finally solved. Thank you again, and best wishes to you.
I want to get the matched token string, for example:
The documentation doesn't seem to provide an API or a convenient way to get the matched token string. What is the best way to get it, or must I use "@stag"? Thank you.